Uncertainty Quantification and Propagation for Airline Disruption Management
Kolawole Ogunsina ∗, Marios Papamichalis, Daniel DeLaurentis

Abstract

Disruption management during the airline scheduling process can be compartmentalized into proactive and reactive processes depending upon the time of schedule execution. The state of the art for decision-making in airline disruption management involves a heuristic human-centric approach that does not categorically study uncertainty in proactive and reactive processes for managing airline schedule disruptions. Hence, this paper introduces an uncertainty transfer function model (UTFM) framework that characterizes uncertainty for proactive airline disruption management before schedule execution, reactive airline disruption management during schedule execution, and proactive airline disruption management after schedule execution to enable the construction of quantitative tools that can allow an intelligent agent to rationalize complex interactions and procedures for robust airline disruption management. Specifically, we use historical scheduling and operations data from a major U.S. airline to facilitate the development and assessment of the UTFM, defined by hidden Markov models (a special class of probabilistic graphical models) that can efficiently perform pattern learning and inference on portions of large data sets. We employ the UTFM to assess two independent and separately disrupted flight legs from the airline route network. Assessment of a flight leg from Dallas to Houston, disrupted by air traffic control hold for bad weather at Dallas, revealed that proactive disruption management for turnaround in Dallas before schedule execution is impractical because of zero transition probability between turnaround and taxi-out. Assessment of another flight leg from Chicago to Boston, disrupted by air traffic control hold for bad weather at Boston, showed that proactive disruption management before schedule execution is possible because of non-zero state transition probabilities at all phases of flight operation.

Keywords: airline disruption management, probabilistic graphical models, hidden Markov models, intelligent systems

This article represents sections of a chapter from the corresponding author’s completed doctoral dissertation.
∗ Corresponding author.
Kolawole Ogunsina: School of Aeronautics and Astronautics, Purdue University, United States.
Marios Papamichalis: Department of Statistics, Purdue University, United States.
Daniel DeLaurentis: School of Aeronautics and Astronautics, Purdue University, United States.
Preprint submitted to Engineering Applications of Artificial Intelligence, February 11, 2021.
1. Introduction
Airlines try to maximize profit (or minimize loss) by solving problems that arise during the scheduling process shown in Fig. 1. The scheduling process represents a paramount long-term and short-term planning mechanism of every airline, wherein resources (i.e. aircraft and crew) available to an airline are paired with a certain amount of passenger demand for air travel (Grosche, 2009), which together define three interdependent problem dimensions: aircraft, crew, and passenger (Kohl et al., 2007). A typical airline schedule is the principal outcome of the airline scheduling process that reveals the flights offered to customers on a particular day of operation. This schedule includes assigned aircraft types, departure and arrival airports, and time-of-day details on how the operation of each flight unfolds from turnaround at the departure airport to aircraft gate-parking at the destination airport.
Irregular operations (IROPS) are prompted by disruptive events that are likely to abound during the execution phase of the airline scheduling process depicted in Fig. 1. These events, which include inclement weather, equipment malfunction, and crew unavailability, are most detrimental to efficiently completing the airline schedule on the day of operation, because most airlines are often forced to delay (or cancel) flights in order to preserve the optimal schedule obtained prior to disruption (Ball et al., 2006). A disruption is defined as a state during the execution of an otherwise optimal schedule, where the deviation from the schedule is sufficiently large that it has to be substantially changed (Galaske & Anderl, 2016). Airlines try to minimize unexpected costs due to disruptions (IROPS) on the day of operation by solving problems that arise during disruption management, through a few widely-accepted rules-of-thumb implemented by human specialists in the Airline Operations Control Center (AOCC). Recent studies have revealed that disruptions yield an increased total annual operating cost of about three to five percent of the airline revenue, and airline profits would more than double if these disruptions disappeared (Amadeus IT Group, 2016; Gershkoff, 2016; Nathans, 2015; Sousa et al., 2015). Hence, airline disruption management is the process of solving problems related to aircraft, crew and passengers when a significant deviation from the optimal schedule obtained prior to execution occurs during schedule execution on the day of operation. In that regard, reactive disruption management during schedule execution typically begins when airline scheduling for proactive disruption management prior to schedule execution ends.

Figure 1: The airline scheduling process
From a statistical perspective, the main objective of disruption management is to eradicate the functional impact of aleatoric uncertainty (Fox & Ulkumen, 2011) that stems from random occurrence of disruptive events like inclement weather on optimal schedule execution on the day of operation. However, the state of the art for attaining the primary objective of airline disruption management introduces epistemic uncertainty in resolving disruptions at each phase of flight when human specialists, with different experience levels and perspectives, are required to make decisions that will affect the disruption resolution implemented in a subsequent flight phase. Although existing approaches for airline disruption management are capable of mitigating the effect of aleatoric uncertainty on scheduled flight operations, they are limited by the incapacity to explicitly address epistemic uncertainty and its impact on the quality of resolutions applied for schedule recovery and disruption management. Advancements in machine learning techniques and big data analysis (Bishop, 2006; C. E. Rasmussen & Williams, 2006; Koller & Friedman, 2009), coupled with cost-efficient computational data storage platforms (Pomerol, 1997), have presented an avenue for the development and assessment of predictive and prescriptive models to facilitate the exploration of new approaches for airline disruption management that address the drawback of the status quo.

Hence, we offer a robust approach that utilizes historical airline data on different rules-of-thumb employed by human specialists in the AOCC together with current practices in airline schedule operations and recovery, to effectively quantify and minimize the propagation of epistemic uncertainty in decision-making for disruption management.
The contributions of our research are as follows:

1. We introduce an innovative uncertainty transfer function model (UTFM) architecture for concurrently tracking and assessing schedule recovery progress and decision-making during airline disruption management. The UTFM architecture relies on a relational dynamic Bayesian network (RDBN) for defining the interaction between the decision-making behavior of a characteristic human specialist (intelligent agent) in the AOCC and schedule evolution during disruption management. Thus, we integrate principles from literature on predictive analytics into literature and practices from airline schedule recovery, to enable uncertainty quantification for autonomous decision-making during disruption management.

2. We implement a data-driven approach for executing the UTFM architecture for decision-making in airline disruption management as probabilistic graphical models. The approach utilizes historical data on schedule and operations recovery from a major U.S. airline to develop probabilistic graphical models for concurrently predicting the most likely sequence of actions for deviation from original schedule (i.e. scheduling activities) and corresponding decisions (i.e. corrective actions) enacted for disruption management during irregular airline operations.

3. We apply the UTFM architecture to provide an assessment of uncertainty propagation from schedule execution to schedule completion for real-world irregular schedule operations induced by weather-related disruptions. The assessment of specific real-world irregular operations on two busy routes from a major U.S. airline network revealed that decisions that resulted in significant deviations (due to bad weather) from the original schedule are most likely to occur during phases of flight operation where the aircraft is on the ground.
We review the literature on airline schedule recovery and predictive analytics in Section 2. Next, in Section 3, we describe our UTFM architecture and discuss the data-driven and unsupervised learning approach for assembling the relational architecture by way of probabilistic graphical models. Section 4 describes our computational setup for achieving high fidelity probabilistic graphical models, while Section 5 reports our results from the evaluation of actual weather-disrupted schedules from a U.S. airline by applying the UTFM architecture. We conclude with our findings and areas for further research in Section 6.

2. Current Practice and Literature
This section provides a background of literature on airline schedule recovery during disruption management and literature on principles of predictive analytics, and how they provide suitable mechanisms to tackle existing problems in airline disruption management.
Figure 2: Current practice in airline disruption management (Castro et al., 2014)
The Airline Operations Control Center (AOCC) typically addresses irregular operations in a sequential manner, such that issues related to the aircraft fleet, crew members, and passengers are resolved in that order by their corresponding human specialists (Barnhart, 2009). This chronological resolution process, depicted in Fig. 2, is characterized by phases of flight operation (Midkiff et al., 2004) where human specialists stationed in the AOCC proactively monitor and mitigate problems and disruptions related to aircraft, crew members, and passengers in the airline network during schedule execution on day of operation.

Castro et al. (2014) expressed in their work that different resolution paradigms used for airline disruption management can be categorized based upon the problem dimensions that can be readily recoverable during irregular operations. They analyzed sixty compelling research works in airline schedule recovery published between 1984 and 2014, and their findings reveal a predominance of classes of literature on solution paradigms for primarily resolving aircraft and crew dimensions (i.e. aircraft recovery and crew recovery). Compared to aircraft recovery and crew recovery, there has been relatively little published research on the integrated and simultaneous recovery of aircraft, crew, and passenger dimensions.

While only a few decision support systems exist that jointly address two or more problem dimensions without the need for additional assessments by human specialists at each solution phase of the current recovery practice (i.e. integrated recovery), the underlying frameworks for developing these decision support systems and other support systems used for the airline scheduling process as a whole are monolithic, primarily based upon operations research (OR) methods (i.e. explicit optimization of goal functions in time-space networks), and often deterministic (Barbati et al., 2012; Clarke, 1998; Marla et al., 2017; Rosenberger et al., 2000). As such, adding supplemental features to the existing airline disruption management framework significantly increases model complexity and the computational time needed to effectively recover a disrupted airline schedule. In addition, many existing decision support systems for airline disruption management are unable to simultaneously address all the problem dimensions in airline operations, while recovering the original airline schedule, partly due to the propagation and evolution of disruptions during the operations recovery process (Lee et al., 2018).

A collaboration between the Amadeus IT Group and Travel Technology Research Limited (two major IT corporations in the global travel industry) recently established that limited bandwidth of human specialists has significantly contributed to the lack of progress in developing complete and effective solutions to airline disruption management (Amadeus IT Group, 2016). Several key decisions at each phase of the recovery practice shown in Fig. 2, such as corrective actions implemented for a certain disruption type, are made by human specialists. Human specialists are flexible in decision-making, but they are not capable of accurately analyzing copious amounts of data necessary for concurrent real-time decision-making for all problem dimensions during schedule and operations recovery. Adding more personnel to the AOCC does not effectively increase human bandwidth, especially for major airlines, as network size increases (Amadeus IT Group, 2016). To this end, our work focuses on adopting principles from machine learning and data-driven predictive techniques for developing an architecture for robust airline disruption management. The proposed architecture enables an efficient utilization of available historical airline data for simultaneous schedule and operations recovery of all problem dimensions (i.e. simultaneously-integrated recovery).
Castro et al. (2014) introduced and demonstrated the first and only published application of principles from predictive analytics in airline disruption management that enables simultaneously-integrated recovery of all problem dimensions. For an optimal schedule recovery plan, the authors use a multi-agent system design paradigm to define a model-free interaction among functional roles in the AOCC, which enabled intelligent agents to negotiate the best utility for their respective problem dimensions through the Generic Q-Negotiation (GQN) reinforcement learning algorithm (Watkins & Dayan, 1992). Although Castro et al. (2014) provide a qualitative and quantitative framework for discerning and modeling adaptive decision-making for airline disruption management, their approach is statistically inefficient because the model-free environment, wherein intelligent agents interact through reinforcement learning (Dayan & Niv, 2008), does not employ (or estimate) a predefined flight schedule and operations model consistent with airline scheduling practices to obtain optimal disruption resolutions during airline schedule recovery. As a result, their approach requires considerable trial-and-error experience to obtain acceptable estimates of future consequences from adopting specific resolutions during airline disruption management. In contrast with the work by Castro et al. (2014), our proposed framework leverages real-world historical data to eliminate the necessity of trial-and-error experience for facilitating simultaneously-integrated recovery during airline disruption management.

Thus, to summarize, this paper enhances prior literature on simultaneously-integrated recovery in two major ways:

1. We adeptly use experience (i.e. historical data on airline schedule and operations recovery) to construct an internal model of the transitions and immediate outcomes of scheduling activities and decisions for different phases of flight operations, by effectively describing the model environment as a relational dynamic Bayesian network architecture. The architecture defines the interaction between schedule changes and decision-making during airline disruption management, for a unique intelligent agent in a multi-agent system.

2. We provide a modular approach for implementing an uncertainty transfer function model for disruption management. The approach incorporates feature engineering and probabilistic graphical modeling methods that enable the use of appropriate machine learning algorithms to effectively calibrate parameters for a relational dynamic Bayesian network architecture.
3. The Uncertainty Transfer Function Model
The debilitating effect of disruptions on the optimal execution of a scheduled revenue flight becomes more pronounced with increasing number of flight legs (Gershkoff, 2016). According to the International Air Transport Association (IATA), a scheduled revenue flight is any flight schedule executed by an airline for commercial remuneration according to a published timetable, and each flight leg in a scheduled revenue flight represents an aircraft’s trip from one airport to another airport without any intermediate stops.
Figure 3: Disruption management for a scheduled flight defined by a Markov decision process
Every flight leg in a scheduled flight is defined by phases of aircraft activity (or flight phases) that are influenced by the decision-making activities of multiple air transportation stakeholders as the aircraft journeys between airports. For an airline, human specialists located in the AOCC perform important decision-making activities at respective flight phases during each flight leg in a scheduled flight, where actions implemented during the immediately preceding flight leg influence the changes in schedule and decisions made in subsequent flight legs. Thus, fundamentally, the decision-making process for managing disruptions in a scheduled flight adheres to the Markov property (Frydenberg, 1990), as illustrated in Fig. 3. Accordingly, schedule changes and decisions at a future flight phase (conditional on both past and present flight phases) during a flight leg are strictly dependent on the schedule changes and decisions made for mitigating irregular operations in the present flight phase, and not on the sequence of schedule changes and decisions made during the flight phases that preceded it.
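The Markov property described above can be illustrated with a minimal sketch of a first-order Markov chain over flight phases. The phase labels, the initial distribution, and every transition probability below are hypothetical values chosen for illustration; they are not estimates from the airline data set, and the function name `path_probability` is likewise an assumption.

```python
# Hypothetical flight phases and transition probabilities (illustration only).
PHASES = ["turnaround", "taxi_out", "enroute", "taxi_in"]

TRANSITIONS = {
    "turnaround": {"turnaround": 0.2, "taxi_out": 0.8},
    "taxi_out": {"taxi_out": 0.1, "enroute": 0.9},
    "enroute": {"enroute": 0.3, "taxi_in": 0.7},
    "taxi_in": {"taxi_in": 1.0},
}

# Assumed initial distribution: every flight leg starts at turnaround.
INITIAL = {"turnaround": 1.0}


def path_probability(path, transitions, initial):
    """Probability of a phase sequence under a first-order Markov chain.

    Each factor depends only on the current phase and the next phase,
    never on the phases that came earlier in the sequence.
    """
    prob = initial.get(path[0], 0.0)
    for current, nxt in zip(path, path[1:]):
        # A missing entry means the transition has zero probability.
        prob *= transitions[current].get(nxt, 0.0)
    return prob


nominal = path_probability(PHASES, TRANSITIONS, INITIAL)
skipped = path_probability(["turnaround", "enroute"], TRANSITIONS, INITIAL)
```

In this toy chain, a sequence that skips taxi-out receives zero probability because no such transition is defined, which is the kind of structural zero that makes a phase unreachable from its predecessor.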
We formulate our UTFM framework for airline disruption management as a relational dynamic Bayesian network (RDBN) (Friedman et al., 1999; Getoor & Taskar, 2007; Sanghai et al., 2005) wherein the modeling domain is defined as an airline route network containing multiple related flight schedules that are routinely executed and randomly disrupted over a certain time frame. The RDBN architecture provides a generative modeling approach that defines a probability distribution over instances of scheduled (disrupted) flights in an airline route network. By employing data features (attributes) that provide a logical description of airline activities for disruption management coupled with probabilistic graphical model templates (schema), the RDBN architecture defines the probabilistic dependencies in a domain across two time slices. Thus, for our RDBN architecture, the following general and interrelated definitions apply (Koller & Friedman, 2009; Neville & Jensen, 2007; Sanghai et al., 2005):
Definition 1 (Dynamic Relational Domain)

Syntax: A term represents any flight phase, flight leg, or flight schedule in an airline route network domain. A predicate represents any concatenation of attributes or activities for any term in the domain.
• The dynamic relational domain is the set of constants, variables, functions, terms, predicates and atomic formulas $Q(r_1, \dots, r_n, t)$ that define an airline route network, such that each argument $r_i$ is a term and $t$ is the time step during disruption management.
• The set of all possible ground predicates at time $t$ is determined by substituting the variables in a low-level schema of each argument with constants and substituting the functions in a high-level schema of each argument with resulting constants.

Semantics: The state of an airline route network domain at time $t$ during disruption management is the set of ground predicates that are most likely at time $t$.

Assumptions:
• The dependencies in an airline route network domain are first-order Markov such that ground predicates at time $t$ can only depend on the ground predicates at time $t$ or $t - 1$.
• A grounding (i.e. referential learning or decoding process) in an airline route network domain at time $t - 1$ precedes a grounding at time $t$, such that this assumption takes priority over the ordering between predicates in the domain:
$Q(r_1, \dots, r_n, t) \prec Q(r'_1, \dots, r'_m, t')$ if $t < t'$

Definition 2 (Two-time-slice relational dynamic Bayesian network: 2-TRDBN)

Syntax: The 2-TRDBN is any graph (or schema) that provides a probability distribution on the state of an airline route network domain at time $t + 1$ given the state of the domain at time $t$.

Semantics: For any predicate $Q$ bounded by groundings at time $t$, we have:
• A set of parents $Pa(Q) = \{Pa_1, \dots, Pa_l\}$, such that each $Pa_i$ is a predicate at time $t - 1$ or $t$.
• A conditional probability model for $P(Q \mid Pa(Q))$, which is a first-order probability tree (or a trellis) on the parent predicates.

Assumptions:
• If $Pa_i$ is at time $t$, then $Pa_i \prec Q$ or $Pa_i = Q$.
• If $Pa_i = Q$, then its groundings are bounded to those that precede the defined grounding of $Q$.

Definition 3 (Relational Dynamic Bayesian Network: RDBN)

Syntax: An RDBN for disruption management is any network pair $(N', N_{\rightarrow})$, such that $N'$ is a dynamic Bayesian network (DBN) at time $t = 0$ and $N_{\rightarrow}$ is a 2-TRDBN.

Semantics: $N'$ characterizes the probability distribution over a relational (airline route network) domain prior to schedule execution (i.e. at $t = 0$). Given the state of the relational domain at a time $t$ during disruption management (or schedule execution), $N_{\rightarrow}$ represents the transition probability distribution on the state of the domain at time $t + 1$.

Assumptions: A term (node) is created for every ground predicate and edges are added between a predicate and its parents at a time $t > 0$.
• Parents are obtained from $N'$ if $t = 0$, else from $N_{\rightarrow}$.
• The conditional probability distribution for each term is defined by a probabilistic graphical model bounded by a specific grounding of the predicate.
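As a worked reading of Definitions 1 to 3, the first-order Markov assumption together with the network pair of Definition 3 implies the usual dynamic Bayesian network factorization of the joint distribution over domain states. The symbols $X_t$ (the state of the domain at time $t$) and $T$ (the final time step) are introduced here for illustration only, and the subscripts on $P$ indicate which network of the pair supplies each factor:

```latex
P(X_0, X_1, \dots, X_T)
  = P_{N'}(X_0) \prod_{t=0}^{T-1} P_{N_{\rightarrow}}\left(X_{t+1} \mid X_t\right)
```

The initial network contributes only the first factor, while the two-time-slice network is reused for every subsequent transition.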
For the purposes of uncertainty quantification and propagation discussed in this paper, we adapt the aforementioned definitions for an RDBN to construct a UTFM, such that the modeling domain is for a representative flight leg that is defined by the probabilistic graphical model (i.e. atomic formula) illustrated by Fig. 4. The flight leg operation sequence (i.e. disruption progression along the horizontal axis in Fig. 4) represents the spatiotemporal axis in a multidimensional Markov chain (Ching et al., 2008) that describes the order in which (or when) random disruptions (i.e. indeterminate features for bad weather events) occur during different phases of flight. As such, the flight leg operation sequence defines the propagation of aleatoric uncertainty in schedule and operations recovery. The schedule evolution sequence (i.e. schedule-planning evolution along the vertical axis in Fig. 4) captures epistemic uncertainty in decision-making for operations recovery by characterizing the order in which (or how) the flight schedule changes as disruption resolutions, such as rules-of-thumb or decision features like delay period, are applied by human specialists on the day of operation. Scheduled events constitute data features (such as departure times, arrival times, aircraft type, etc.) that define the optimal airline (flight) schedule for m different flight phases prior to schedule execution.

Figure 4: RDBN architecture for a representative flight leg

Furthermore, scheduled events serve as start points in the UTFM architecture and may also inform the decision-making of human specialists during the resolution of a specific type of disruption. Unscheduled events represent an updated set of data features that characterize the adjustment of optimal flight schedule by human specialists based upon the impact of disruption at m different flight phases during schedule execution. Unscheduled events provide end points in the UTFM architecture.
Schedule feature states (labeled S in Fig. 4) represent functions of data items that are strictly subject to uncertainty in determinate data features with respect to airline planning and scheduling prior to schedule execution. Decision feature states (labeled D in Fig. 4) represent functions of action items that human specialists implement during schedule execution to resolve disruptions in the optimal schedule obtained prior to schedule execution (e.g. delay time, flight swap flag, etc.), while outcome feature states (labeled O in Fig. 4) represent functions of data items that human specialists use to assess the impact of their decisions after resolving deviations from the optimal schedule obtained prior to schedule execution. The parameters for S, D, O, α, β, γ, κ, λ in Fig. 4 are obtained by grounding via hidden Markov models, to determine the schedule evolution and decision-making proclivities of human specialists at each flight phase during disruption management for a characteristic flight leg. Refer to the algorithms in Appendix D, Appendix E, Appendix F, and Appendix G for more information on UTFM grounding.
Prior to learning and assembling the UTFM to enable the prediction of uncertainty propagation patterns during airline disruption management, it is imperative to understand the nature of the airline data set that will be used to develop constituent models. By following the atomic formula from Fig. 4 and appraising a raw data set defined by over 40 separate data features provided by a major U.S. airline, this section describes the methods used to abstract and encode different data features in the data set to achieve high fidelity probabilistic graphical models.
We apply a combination of event abstraction and uncertainty abstraction principles (Ogunsina et al., 2021) to establish three separate classes of data features for uncertainty quantification and propagation in airline disruption management, namely:

1. Determinate aleatoric features: These are flight schedule features that are subject to the least possible uncertainty for the risk of alteration during irregular operations for disruption management, based upon inherent randomness of disruptive events. For instance, longitude and latitude coordinates that provide specific geographical information for origin and destination stations are always assumed to remain unchanged, by the AOCC, during the recovery of a delayed flight schedule.

2. Indeterminate aleatoric features: These are data features that are subject to the most possible uncertainty for the risk of instantiating irregular operations during schedule execution, due to inherent randomness of disruption events. Examples include IATA delay codes that indicate inclement weather at a particular airport, which may require a human specialist in the AOCC to delay the departure of a specific flight at the (origin) airport and reassign some or all of its passengers to another flight with a later departure.

3. Epistemic features: These are flight schedule features that are subject to the most possible uncertainty for the risk of alteration during irregular operations for disruption management, due to lack of knowledge of the exact impact of their alteration. For instance, following a specific disruption like late arrival of flight crew for a scheduled flight, a human specialist in the AOCC may choose to delay the departure of the flight by a specific period of time after the original departure time. However, most times, the human specialist cannot guarantee that the decision to apply a particular delay duration after scheduled departure will produce a specific disruption management outcome, due to the cascading effect of disruptions in large airline networks.
Since many algorithms for learning probabilistic graphical models perform best with continuous data (Getoor & Taskar, 2007; Ogunsina et al., 2021, 2019), it is necessary to encode all values of features (or fields) in the data set into functional and relevant continuous data for use in appropriate algorithms (Liskov, 1988; Reid Turner et al., 1999). Table 1 reveals the feature engineering methods applied to transform the features in a raw data set for developing and assessing the UTFM.

Table 1: Feature Engineering and Transformation for UTFM

| Raw Data Class | First-Degree Transformation | Second-Degree Transformation | Refined Data Type |
|---|---|---|---|
| Geographic Features | Spherical directional vectors, geodesic distance | Standardization | Continuous |
| Temporal Features | Periodic (sine/cosine) vectors | Standardization | Continuous |
| Categorical Features | One-hot encoding | Standardization | Continuous |
| Continuous Features | N/A | Standardization | Continuous |

As shown in Table 1, first-degree transformation represents the conversion of different attributes that define data features into appropriate mathematical functions and values, while second-degree transformation represents the conversion of data features into suitable statistical distributions based upon the number of available flight schedules (i.e. data samples). As such, raw geographical features are converted into spherical directional vectors and geodesic distance (T. Vincenty, 1975) while raw temporal features are converted into sine or cosine vectors during first-degree transformation. Categorical data features in the raw data set are converted into sparse matrices during first-degree transformation through one-hot encoding (Seger, 2018). All data features (fields) are subsequently standardized (scaled to zero mean and unit variance) during second-degree transformation to facilitate statistical interpretation of the results obtained from probabilistic graphical models. A complete definition of all the refined data features used for creating the probabilistic graphical models discussed in this paper can be found in Appendix A, Appendix B, and Appendix C.
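The transformations summarized in Table 1 can be sketched in a few lines of standard-library Python. This is a minimal illustration under stated assumptions, not the authors' pipeline: all function names are hypothetical, time of day is assumed to be given in minutes after midnight, and the great-circle (haversine) distance on a spherical Earth is used as a simpler stand-in for the Vincenty geodesic formula cited in the text.

```python
import math
from statistics import mean, pstdev


def encode_time_of_day(minutes_after_midnight):
    """First-degree transform: map a clock time to (sin, cos) on the unit circle."""
    angle = 2.0 * math.pi * minutes_after_midnight / 1440.0
    return math.sin(angle), math.cos(angle)


def direction_vector(lat_deg, lon_deg):
    """First-degree transform: latitude/longitude to a unit vector on the sphere."""
    lat, lon = math.radians(lat_deg), math.radians(lon_deg)
    return (
        math.cos(lat) * math.cos(lon),
        math.cos(lat) * math.sin(lon),
        math.sin(lat),
    )


def haversine_km(lat1, lon1, lat2, lon2, radius_km=6371.0):
    """Great-circle distance; a spherical stand-in for the Vincenty geodesic."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    # Clamp to guard against rounding slightly above 1.0.
    return 2.0 * radius_km * math.asin(min(1.0, math.sqrt(a)))


def one_hot(value, categories):
    """First-degree transform: categorical value to a 0/1 indicator vector."""
    return [1.0 if value == category else 0.0 for category in categories]


def standardize(values):
    """Second-degree transform: scale a feature column to zero mean, unit variance."""
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:
        return [0.0 for _ in values]  # a constant column carries no spread
    return [(v - mu) / sigma for v in values]
```

The sine/cosine pair keeps times just before and just after midnight close together, which a raw minute count would not.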
Figure 5: Component assembly approach for automatic uncertainty quantification for disruption management
We use a solution technique based upon a component assembly process, which enables generative programming for probabilistic graphical models (Koller & Friedman, 2009), to calibrate (ground) the parameters of the multidimensional Markov chain that defines the UTFM introduced in Section 3. Component assembly is a widely espoused modeling paradigm in computer science and software engineering (Cao et al., 2005), and facilitates the integration of state components of the UTFM that define separate phases of flight operation and schedule-recovery evolution in the UTFM architecture. Through generative programming (Chase, 2005; Czarnecki, 2005), highly customized and optimized intermediate parameters defining each state component and aggregate UTFM parameters can be created on demand from elementary and reusable parameters of state components, through a priori knowledge of the graph structure of the Markov system.

Fig. 5 reveals our solution approach to automatic uncertainty quantification for airline disruption management. The approach starts by abstracting historical airline schedule and operations recovery data into a digestible data set, via the methods described in Section 3.2, applicable to algorithms for predictive analytics. Next, the refined data set is used to learn optimal probabilistic graphical model parameters of each state component of the UTFM, before constructing an overarching probabilistic graphical model from the aggregation of the respective optimized probabilistic graphical models of state components.

For the remainder of this section, we introduce probabilistic graphical modeling and discuss the role of hidden Markov models for grounding (i.e. calibrating the parameters) in a probabilistic graphical model representation of the UTFM.
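To make the role of a hidden Markov model more concrete, the following minimal sketch implements the forward recursion, which computes the likelihood of an observation sequence and is a building block of standard HMM parameter learning. The hidden states, observation symbols, and probabilities are hypothetical and not drawn from the airline data; discrete emissions are also a simplification, since the refined features described earlier are continuous.

```python
def forward_likelihood(observations, states, start_p, trans_p, emit_p):
    """Return P(observations) under a discrete HMM via the forward recursion."""
    # Initialization: joint probability of the first observation and each state.
    alpha = {s: start_p[s] * emit_p[s][observations[0]] for s in states}
    # Recursion: propagate through the transition model, then weight by emission.
    for obs in observations[1:]:
        alpha = {
            s: emit_p[s][obs] * sum(alpha[r] * trans_p[r][s] for r in states)
            for s in states
        }
    # Termination: marginalize over the final hidden state.
    return sum(alpha.values())


# Hypothetical two-state model (illustration only).
STATES = ("on_time", "delayed")
START = {"on_time": 0.8, "delayed": 0.2}
TRANS = {
    "on_time": {"on_time": 0.9, "delayed": 0.1},
    "delayed": {"on_time": 0.3, "delayed": 0.7},
}
EMIT = {
    "on_time": {"no_hold": 0.9, "hold": 0.1},
    "delayed": {"no_hold": 0.2, "hold": 0.8},
}

likelihood = forward_likelihood(["no_hold", "hold"], STATES, START, TRANS, EMIT)
```

A learning procedure would adjust the start, transition, and emission tables to raise this likelihood on historical sequences; that step is omitted here.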
Probabilistic graphical modeling provides an avenue for a data-driven approach to constructing the UTFM architecture, which is very effective in practice (Koller & Friedman, 2009). By employing rudimentary activity guidelines from human specialists in the AOCC for airline disruption management, critical components for constructing an intelligent system, such as representation, learning, and inference, can be readily incorporated into the UTFM.

Figure 6: Probabilistic graphical model representation of UTFM

Fig. 6 shows the probabilistic graphical model representation of the UTFM, defined by four major phases of flight along the operation sequence axis, namely Turnaround, Taxi-Out, Enroute, and Taxi-In, while the schedule evolution sequence axis is defined by three separate phases of schedule changes with respect to airline planning on day of operation, namely Schedule, Decision, and Outcome. Thus, the graph structure of the UTFM comprises 12 distinct component states (nodes) with 12 internal state transitions and 17 external state transitions, such that each component state contains a set of combinations (interactions) of data features, listed in Section 4, that encode the behavioral proclivities of human specialists at different phases of activity during airline disruption management.

Schedule state components (i.e., TAS, TOS, ES, TIS) in Fig. 6 represent an interaction of data features that describe the evolution of the original (optimal) flight schedule predetermined prior to schedule execution on day of operation, which would inform the decision-making of a human specialist in the AOCC during schedule execution. As such, schedule state components in the UTFM encapsulate epistemic uncertainty in proactive disruption management prior to schedule execution (i.e., uncertainty in tactical disruption management).

Decision state components in the UTFM (i.e., TAD, TOD, ED, TID) define the interaction of data features that describe the action items that human specialists implement for resolving specific types of disruption that occur during schedule execution, and define epistemic uncertainty in reactive disruption management during rescheduling on day of operation (i.e., uncertainty in operational disruption management). Outcome state components in Fig. 6 (i.e., TAO, TOO, EO, TIO) represent the interaction of a set of data features that characterize the original schedule adjusted based upon the impact of disruption resolutions (i.e. action items) implemented by human specialists during schedule execution, and therefore define epistemic uncertainty in proactive disruption management for future airline scheduling after schedule execution (i.e., uncertainty in strategic disruption management).
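The node and transition counts quoted for the UTFM graph can be checked with a small Python sketch. The construction below is an assumption about how the counts arise: it builds a 4 x 3 grid from the four flight phases and the three schedule-evolution stages, adds one self-loop per node (internal transitions), and links each node to its successor along either axis (external transitions). The function name `build_utfm_graph` is hypothetical.

```python
from itertools import product

PHASES = ["TA", "TO", "E", "TI"]  # Turnaround, Taxi-Out, Enroute, Taxi-In
STAGES = ["S", "D", "O"]          # Schedule, Decision, Outcome

def build_utfm_graph():
    """Return (nodes, internal transitions, external transitions) of the grid."""
    nodes = [phase + stage for stage, phase in product(STAGES, PHASES)]
    internal = [(n, n) for n in nodes]  # remaining in a state component
    external = []
    # Operation sequence axis: e.g. TAS -> TOS -> ES -> TIS within each stage.
    for stage in STAGES:
        for a, b in zip(PHASES, PHASES[1:]):
            external.append((a + stage, b + stage))
    # Schedule evolution axis: e.g. TAS -> TAD -> TAO within each flight phase.
    for phase in PHASES:
        for a, b in zip(STAGES, STAGES[1:]):
            external.append((phase + a, phase + b))
    return nodes, internal, external

nodes, internal, external = build_utfm_graph()
print(len(nodes), len(internal), len(external))
```

Under this construction there are 3 x 3 = 9 transitions along the operation sequence axis and 4 x 2 = 8 along the schedule evolution axis, which together give the 17 external transitions.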
The hidden Markov model (HMM), also known as a transducer-style probabilistic finite state machine (Vidal et al., 2005), is the simplest class of dynamic Bayesian networks and a useful tool for representing probability distributions over a sequence of observations (Ghahramani, 2001; Letham & Rudin, 2012). The hidden Markov model obtains its name from two separate but related characteristics. First, it assumes that the observation at a particular instance in time was generated by an arbitrary process whose state is hidden from the observer. Second, it assumes that the state of this hidden process satisfies the Markov property. To that effect, the hidden Markov model lends an appropriate grounding medium for solving the learning and inference (decoding) problems (Yang et al., 1997) for the probabilistic graphical model representation and construction of the UTFM.

Mathematically, the hidden Markov model is defined as a stochastic process $(X_k, Y_k)_{k \geq 0}$ on the product state space $(E \times F, \mathcal{E} \otimes \mathcal{F})$ if there exist transition kernels $P : E \times \mathcal{E} \to [0, 1]$ and $\Phi : E \times \mathcal{F} \to [0, 1]$ such that

$$\mathbb{E}\big(g(X_{k+1}, Y_{k+1}) \mid X_0, Y_0, \ldots, X_k, Y_k\big) = \int g(x, y)\, \Phi(x, dy)\, P(X_k, dx) \tag{1}$$

and a probability measure $\mu$ on $E$ wherein

$$\mathbb{E}\big(g(X_0, Y_0)\big) = \int g(x, y)\, \Phi(x, dy)\, \mu(dx) \tag{2}$$

for any bounded and measurable function $g : E \times F \to \mathbb{R}$. As such, $\mu$ represents the initial measure, $P$ is the transition kernel, and $\Phi$ represents the observation kernel of the hidden Markov model $(X_k, Y_k)_{k \geq 0}$.

The learning problem for construction of the UTFM is representative of optimizing the parameters of the pair of dynamic Bayesian networks $(N', N_{\to})$ defined in Section 3.1 based upon available data, and therefore presents two separate learning sub-problems: Intra-State HMM learning and Inter-State HMM learning. Hence, Intra-State HMM learning and Inter-State HMM learning characterize the grounding process for obtaining optimal parameters for $N'$ and $N_{\to}$ respectively. Specifically, Intra-State HMM learning represents the ability to effectively determine appropriate interaction patterns (i.e. transition likelihood) for hidden data features (subject to epistemic uncertainty) which are embedded in each state component of the UTFM shown in Fig. 6, based upon observing data features (i.e. observations) that are strictly subject to uncertainty from determinate or indeterminate aleatoric features observed at any phase of activity during airline disruption management. Some examples of data features that represent observations for Intra-State HMM learning of state components in the UTFM include total distance between origin airport and destination airport, and total number of passengers (i.e. demand for air travel) available for flight before and after schedule execution. Thus, the primary objective of Intra-State HMM learning is to achieve an optimal HMM (probability distribution mixture model) that is capable of efficiently predicting the likelihood of remaining at a particular phase of activity (i.e. state component) in the UTFM for airline disruption management.
Inter-State HMM learning, on the other hand, characterizes the ability to ascertain the interaction or transition patterns between any two neighboring state components (phases of activity) in the UTFM, wherein data features (listed in Section 4) embedded in the state component at the future (posterior) phase of activity in the UTFM are set as observations while data features embedded in the state component at the current (prior) phase of activity are set as hidden states. As such, the primary objective of Inter-State HMM learning is to attain an optimal HMM (probability distribution mixture model) that is capable of accurately predicting the likelihood of transitioning between present and future phases of activity (i.e. state components) in the UTFM.

1. Compute

$$Q(\theta, \theta') = \sum_{z \in \bar{Z}} \log\left[P(X, z; \theta)\right] P(z \mid X; \theta') \tag{3}$$

2. Set

$$\theta'_{+1} = \arg\max_{\theta} Q(\theta, \theta') \tag{4}$$

The Baum-Welch algorithm (Baum & Petrie, 2007) is a dynamic programming approach that uses the expectation maximization (EM) algorithm (Bilmes, 2011) to find the maximum likelihood estimate of the parameters of an HMM given a set of observations. The Baum-Welch algorithm presents a convenient means for learning the optimal parameters (i.e. state transition and emission probabilities) of an Intra-State or Inter-State HMM, because it guarantees that the optimal parameters of the HMM are easily estimated in an unsupervised manner during training by utilizing unannotated observation data (Boussemart et al., 2012). In essence, the Baum-Welch algorithm described by the steps in Equations 3 and 4, where $X$, $\bar{Z}$, and $\theta$ are the latent state space, observation space, and initial HMM parameters respectively, is an iterative procedure for estimating $\theta'$ until convergence, such that each iteration of the algorithm is guaranteed not to decrease the log-likelihood of the data. However, convergence to a globally optimal solution is not necessarily guaranteed (Baum & Petrie, 2007).

Figure 7: Intra-state HMM schema for remaining in an activity phase in UTFM

Fig. 7 reveals the general schema for learning the optimal parameters of an Intra-State HMM. The circles and squares in Fig. 7 represent the hidden (latent) states (i.e. data features subject to epistemic uncertainty) and observations (i.e. data features which are representative of aleatoric uncertainty) respectively. The learning objective for the Intra-State HMM schema in Fig. 7 is to use the Baum-Welch algorithm to find the optimal HMM parameters, which are depicted by the solid and dashed arrows that represent state transition probabilities and emission probabilities respectively.

Figure 8: Inter-state HMM schema for transitioning between activity phases in UTFM

Fig. 8 shows a generic schema for learning the optimal parameters of a typical Inter-State HMM, essential for predicting the likelihood of transitioning between phases of activity in the UTFM. The circles and squares in the Inter-State HMM schema, shown in Fig. 8, represent epistemic data features embedded in current activity phase A (i.e. hidden states) and future activity phase B (i.e. observations), respectively, in the UTFM. Similar to the Intra-State HMM, the learning objective for the Inter-State HMM schema depicted by Fig. 8 is to use the Baum-Welch algorithm to find the optimal HMM parameters, which are depicted by the solid and dashed arrows that represent the state transition probabilities and emission probabilities respectively.

Unlike the Intra-State HMM schema, where hidden states represent data features subject to epistemic uncertainty for disruption management and observations represent data features subject to aleatoric uncertainty, both hidden states and observations in the Inter-State HMM schema are representative of data features subject to epistemic uncertainty for disruption management. Thus, the overarching objective of an optimal Intra-State HMM is to accurately and expeditiously quantify the epistemic uncertainty at a specific phase of activity in the UTFM, while the overall objective of an optimal Inter-State HMM is to precisely predict the propagation of epistemic uncertainty between different phases of activity in the UTFM, for robust airline disruption management.
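The two Baum-Welch steps in Equations 3 and 4 can be sketched for the simplest case of an HMM with a finite number of hidden states and discrete observation symbols. This is an illustrative NumPy sketch under stated assumptions, not the implementation used for the UTFM: the observations here are integer symbols rather than standardized continuous features, the emission matrix is initialized at random to break symmetry, and the names `forward_backward` and `baum_welch` are hypothetical.

```python
import numpy as np

def forward_backward(A, B, pi, obs):
    """Scaled forward-backward pass; returns alpha, beta and per-step scales."""
    T, N = len(obs), A.shape[0]
    alpha, beta, scale = np.zeros((T, N)), np.ones((T, N)), np.zeros(T)
    alpha[0] = pi * B[:, obs[0]]
    scale[0] = alpha[0].sum()
    alpha[0] /= scale[0]
    for t in range(1, T):
        alpha[t] = (alpha[t - 1] @ A) * B[:, obs[t]]
        scale[t] = alpha[t].sum()
        alpha[t] /= scale[t]
    for t in range(T - 2, -1, -1):
        beta[t] = (A @ (B[:, obs[t + 1]] * beta[t + 1])) / scale[t + 1]
    return alpha, beta, scale

def baum_welch(obs, n_states, n_symbols, n_iter=100, tol=1e-6, seed=0):
    """EM for a discrete HMM; returns (A, B, pi, log-likelihood history)."""
    rng = np.random.default_rng(seed)
    obs = np.asarray(obs)
    A = np.full((n_states, n_states), 1.0 / n_states)      # uniform transitions
    B = rng.dirichlet(np.ones(n_symbols), size=n_states)   # random emissions
    pi = np.full(n_states, 1.0 / n_states)
    history = []
    for _ in range(n_iter):
        # E-step: posterior state (gamma) and transition (xi) probabilities.
        alpha, beta, scale = forward_backward(A, B, pi, obs)
        history.append(float(np.log(scale).sum()))
        if len(history) > 1 and history[-1] - history[-2] < tol:
            break
        gamma = alpha * beta
        xi = (alpha[:-1, :, None] * A[None]
              * (B[:, obs[1:]].T * beta[1:])[:, None, :]
              / scale[1:, None, None])
        # M-step: re-estimate parameters from expected counts.
        pi = gamma[0]
        A = xi.sum(axis=0) / gamma[:-1].sum(axis=0)[:, None]
        B = np.stack([gamma[obs == k].sum(axis=0) for k in range(n_symbols)], axis=1)
        B /= gamma.sum(axis=0)[:, None]
    return A, B, pi, history
```

The recorded log-likelihood history should be non-decreasing from one iteration to the next, which is a convenient sanity check on any EM implementation; different seeds may still converge to different local optima.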
Upon learning optimal parameters of Intra-State and Inter-State hidden Markov models, which define proactive and reactive behavioral patterns of human specialists at different stages of airline disruption management in the UTFM, it is imperative to conduct inference on the models to complete the assembly of the UTFM for effectively predicting uncertainty propagation patterns for airline disruption management. Similar to the learning problem, the inference problem for the assemblage of the UTFM is defined by two separate sub-problems: component UTFM decoding and aggregate UTFM decoding. Component UTFM decoding defines the capacity of both Intra-State and Inter-State hidden Markov models for obtaining the most probable sequence of hidden (epistemic) data features in both types of HMMs, based upon (aleatoric or epistemic) observation data features necessary for decoding in their respective schema illustrated in Figs. 7 and 8. Thus, the primary objective of the component UTFM decoding problem is to provide the maximum likelihood estimates of the most probable sequence of hidden data features from optimal Intra-State and Inter-State HMMs upon inputting appropriate observation data features.

Aggregate UTFM decoding, on the other hand, describes the ability of the amalgamation of all Intra-State and Inter-State HMMs that constitute the UTFM to precisely estimate the quantification and propagation of epistemic uncertainty at all phases of activity in the UTFM, based upon observing the maximum likelihood estimates of the most probable sequence of hidden data features retrieved from optimal Intra-State HMMs in the UTFM by way of component UTFM decoding. As such, a complementary objective of the aggregate UTFM decoding problem is to obtain the parameters for $S, D, O, \alpha, \beta, \gamma, \kappa, \lambda$ as shown in Fig. 4, by estimating the weighted average of the maximum likelihood estimates of the most probable sequence of hidden data features retrieved from all optimal Intra-State and Inter-State HMMs upon observing their respective input data features (i.e. observations).

$$x^{*} = \arg\max_{x} P(z, x \mid \theta') \tag{5}$$

The Viterbi decoding algorithm (Forney, 1973; Viterbi, 1967) is a proven dynamic programming algorithm that performs the HMM inference of the most probable sequence of hidden states (and its corresponding likelihood) based upon a specific sequence of observations, ultimately solving both the component and aggregate UTFM decoding sub-problems. In principle, the Viterbi decoding algorithm defined by Equation 5, where $x$, $z$, and $\theta'$ represent a sequence of hidden states, a sequence of observations, and an arbitrary HMM respectively, uses a recursive (backtracking search) procedure for obtaining the optimal sequence of hidden states from the total number of possible sequences of hidden states for a specific sequence of observations, by selecting the sequence of hidden states that has the highest probability based upon maximum likelihood estimations from the arbitrary HMM (Forney, 1973). As such, the Viterbi decoding algorithm provides an efficient method for avoiding the explicit enumeration of all possible combinations of sequences of hidden states (i.e. concatenations of data features) while identifying the optimal sequence (i.e. Viterbi path) of hidden states with the highest probability of occurrence or least uncertainty (Omura, 1969).

In summary, from a UTFM assemblage perspective, the underlying objective of component UTFM decoding is to perform inference on all optimal Intra-State and Inter-State HMMs that define the UTFM, by implementing the Viterbi decoding algorithm to effectively estimate the likelihood (Viterbi probability) of the most likely sequence of hidden states (data features) based upon observing appropriate data features (observations), as shown in Figs. 7 and 8. By extension, the overall objective of aggregate UTFM decoding is to apply the Viterbi algorithm for determining the most likely sequence of state components that describes the propagation of epistemic uncertainty at different phases of activity in the UTFM shown in Fig. 6. The state transition parameters of a representative probabilistic finite state machine for the UTFM are weighted averages of the Viterbi probabilities obtained via component UTFM decoding that satisfy the properties of a stochastic matrix (Haggstrom, 2002).
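The Viterbi recursion behind Equation 5 can be sketched in a few lines of NumPy for a discrete HMM. This is a generic textbook formulation working in log space, offered as an illustration only: the function name `viterbi`, the integer-coded observations, and the example parameters are assumptions and are not taken from the UTFM.

```python
import numpy as np

def viterbi(A, B, pi, obs):
    """Most probable hidden-state path and its log joint probability."""
    T, N = len(obs), A.shape[0]
    with np.errstate(divide="ignore"):  # log(0) = -inf is acceptable here
        logA, logB, logpi = np.log(A), np.log(B), np.log(pi)
    delta = np.zeros((T, N))            # best log-probability ending in each state
    psi = np.zeros((T, N), dtype=int)   # back-pointers to the best predecessor
    delta[0] = logpi + logB[:, obs[0]]
    for t in range(1, T):
        scores = delta[t - 1][:, None] + logA  # scores[i, j]: arrive in j from i
        psi[t] = scores.argmax(axis=0)
        delta[t] = scores.max(axis=0) + logB[:, obs[t]]
    # Backtrack from the best final state.
    path = np.zeros(T, dtype=int)
    path[-1] = delta[-1].argmax()
    for t in range(T - 2, -1, -1):
        path[t] = psi[t + 1, path[t + 1]]
    return path.tolist(), float(delta[-1].max())

# Illustrative two-state example with made-up parameters.
A = np.array([[0.7, 0.3], [0.4, 0.6]])
B = np.array([[0.9, 0.1], [0.2, 0.8]])
pi = np.array([0.6, 0.4])
path, log_prob = viterbi(A, B, pi, [0, 1, 1, 0])
print(path, np.exp(log_prob))
```

The cost is proportional to the sequence length times the square of the number of states, instead of growing exponentially with sequence length as explicit enumeration would. Exponentiating the returned log-probability gives the quantity the text calls the Viterbi probability.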
4. Computational Setup and Analysis
We now discuss the computational framework for generating state components of the probabilistic graphical model representation of the UTFM (shown in Fig. 6), which is used to predict epistemic uncertainty propagation during decision-making for airline disruption management. Prior to implementing the Baum-Welch and Viterbi algorithms to learn and decode useful HMMs for determining authentic likelihoods of internal and external transitions amongst different state components in the UTFM, raw historical airline data, necessary for enabling the application of algorithms for the development of these probabilistic graphical models, is first refined by following the data abstraction and feature engineering guidelines described in Section 3.2. Following data pre-processing and refinement, models are subsequently implemented through learning and decoding in the Python programming language, facilitated by pomegranate (Schreiber, 2016), on a 56-core workstation running at 2.60 GHz with 192 GB of RAM.

Table 2: List of features for Intra-State HMMs in UTFM.

| Intra-State HMM for UTFM | Hidden States | Observations |
|---|---|---|
| TAS (Turnaround Schedule) | SWAP FLT FLAG, SCHED ACFT TYPE, SCHED TURN MINS, tod sched PB | RTE, FREQ, PAX DMD |
| TOS (Taxi-out Schedule) | taxi out, tod actl TO, sched block mins | RTE, FREQ, PAX DMD |
| ES (Enroute Schedule) | actl enroute mins, tod actl LD, sched block mins | RTE, FREQ, PAX DMD |
| TIS (Taxi-in Schedule) | taxi in, tod sched GP, sched block mins | RTE, FREQ, PAX DMD |
| TAD (Turnaround Decision) | shiftper sched PB, ADJST TURN MINS, DELY MIN, SWAP FLT FLAG | ORIG, DEST, FREQ, PAX DMD, DISRP |
| TOD (Taxi-out Decision) | late out vs sched mins, shiftper actl PB, DELY MIN | ORIG, DEST, FREQ, PAX DMD, DISRP |
| ED (Enroute Decision) | shiftper actl TO, shiftper actl LD, DOT DELAY MINS | ORIG, DEST, FREQ, PAX DMD, DISRP |
| TID (Taxi-in Decision) | DOT DELAY MINS, shiftper sched GP, shiftper actl GP | ORIG, DEST, FREQ, PAX DMD, DISRP |
| TAO (Turnaround Outcome) | SWAP FLT FLAG, ACTL ACFT TYPE, ACTL TURN MINS, tod actl PB | RTE, FREQ, PAX DMD |
| TOO (Taxi-out Outcome) | taxi out, tod actl TO, actl block mins | RTE, FREQ, PAX DMD |
| EO (Enroute Outcome) | actl enroute mins, tod actl LD, actl block mins | RTE, FREQ, PAX DMD |
| TIO (Taxi-in Outcome) | taxi in, tod actl GP, actl block mins | RTE, FREQ, PAX DMD |

Table 3: List of features for Inter-State HMMs in UTFM.

| Inter-State HMM for UTFM | Hidden States | Observations |
|---|---|---|
| TAS → TOS | SWAP FLT FLAG, SCHED ACFT TYPE, SCHED TURN MINS, tod sched PB | taxi out, tod actl TO, sched block mins |
| TOS → ES | taxi out, tod actl TO, sched block mins | actl enroute mins, tod actl LD, sched block mins |
| ES → TIS | actl enroute mins, tod actl LD, sched block mins | taxi in, tod sched GP, sched block mins |
| TAD → TOD | shiftper sched PB, ADJST TURN MINS, DELY MIN, SWAP FLT FLAG | late out vs sched mins, shiftper actl PB, DELY MIN |
| TOD → ED | late out vs sched mins, shiftper actl PB, DELY MIN | shiftper actl TO, shiftper actl LD, DOT DELAY MINS |
| ED → TID | shiftper actl TO, shiftper actl LD, DOT DELAY MINS | DOT DELAY MINS, shiftper sched GP, shiftper actl GP |
| TAO → TOO | SWAP FLT FLAG, ACTL ACFT TYPE, ACTL TURN MINS, tod actl PB | taxi out, tod actl TO, actl block mins |
| TOO → EO | taxi out, tod actl TO, actl block mins | actl enroute mins, tod actl LD, actl block mins |
| EO → TIO | actl enroute mins, tod actl LD, actl block mins | taxi in, tod actl GP, actl block mins |
| TAS → TAD | SWAP FLT FLAG, SCHED ACFT TYPE, SCHED TURN MINS, tod sched PB | shiftper sched PB, ADJST TURN MINS, DELY MIN, SWAP FLT FLAG |
| TOS → TOD | taxi out, tod actl TO, sched block mins | late out vs sched mins, shiftper actl PB, DELY MIN |
| ES → ED | actl enroute mins, tod actl LD, sched block mins | shiftper actl TO, shiftper actl LD, DOT DELAY MINS |
| TIS → TID | taxi in, tod sched GP, sched block mins | DOT DELAY MINS, shiftper sched GP, shiftper actl GP |
| TAD → TAO | shiftper sched PB, ADJST TURN MINS, DELY MIN, SWAP FLT FLAG | SWAP FLT FLAG, ACTL ACFT TYPE, ACTL TURN MINS, tod actl PB |
| TOD → TOO | late out vs sched mins, shiftper actl PB, DELY MIN | taxi out, tod actl TO, actl block mins |
| ED → EO | shiftper actl TO, shiftper actl LD, DOT DELAY MINS | actl enroute mins, tod actl LD, actl block mins |
| TID → TIO | DOT DELAY MINS, shiftper sched GP, shiftper actl GP | taxi in, tod actl GP, actl block mins |

4.1. UTFM Input and Output Features
Table 2 and Table 3 reveal the hidden states (latent output data features) and observations (observed input data features) for all Intra-State and Inter-State HMMs, respectively, that constitute an aggregate HMM which defines the UTFM. The selection of specific hidden and observation data features, for all Intra-State and Inter-State HMMs that define the UTFM, was informed partly by literature (Clarke, 1998; Hao & Hansen, 2013; Midkiff et al., 2004), partly by exploratory data analysis (Ogunsina et al., 2021), and partly by discussions with human experts at the AOCC of the U.S. airline that provided the raw historical data. We adopted this hybrid feature selection approach to ensure that data features which are appropriately relevant at a specific phase of activity in the UTFM are parameters of the corresponding HMM that represents that phase of activity for airline disruption management.

For Intra-State HMMs listed in Table 2, observations (i.e. observed aleatoric data features) are defined by data features that are strictly subject to aleatoric uncertainty with respect to how often they are considered, by human specialists, in order to attain optimal schedules during the airline scheduling process shown in Fig. 1. Therefore, observations for Intra-State HMMs, listed in Table 2, include data features that represent the following: origin airport location and flight origin (ORIG), destination airport location (DEST), flight operating period in a calendar year (FREQ), route distance between origin and destination airports (RTE), number of passengers available for flight (PAX DMD), and random disruption types such as inclement weather (DISRP). ORIG, DEST, FREQ, RTE, and PAX DMD represent determinate aleatoric features that are determined by the airline, which are subject to aleatoric uncertainty at all phases of activity in the UTFM. As such, these features are indicative of the uniqueness of a particular flight schedule with respect to the airline route network. DISRP represents indeterminate aleatoric features that are subject to uncertainty which cannot be readily controlled by an airline, and thus represents pure aleatory in airline disruption management. Hidden states (i.e. epistemic output data features) for Intra-State HMMs represent data features that are strictly subject to epistemic uncertainty with respect to the concatenation (interaction) of latent data features with the highest probability of occurrence, which indicate the activity patterns of human specialists (i.e. decision-making) for attaining optimal schedules during the airline scheduling process.

For Inter-State HMMs listed in Table 3, observations (i.e. observed epistemic data features) represent data features that are strictly subject to epistemic uncertainty with respect to the Viterbi probability (i.e. probability of the most likely sequence of latent data features estimated by an Intra-State HMM) at an immediate future phase of activity in the UTFM, while hidden states (i.e. latent epistemic data features) represent data features whose concatenations are strictly subject to epistemic uncertainty with respect to the Viterbi probability estimated by a characteristic Intra-State HMM in the present phase of activity in the UTFM during airline schedule planning and disruption management.
Fig. 9 reveals a one-dimensional spatiotemporal representation of the UTFM reduced along the operation sequence axis (i.e. arbitrary column in Fig. 4). Yellow plates, indicated by SCHD FEAT, DESN FEAT, and OUT FEAT in Fig. 9, are representative of epistemic data features which define separate hidden states for Intra-State HMMs at each phase of flight operation along the operation sequence axis (i.e. Turnaround, Taxi-Out, Enroute, and Taxi-In) in the UTFM detailed in Fig. 6. In that regard, SCHD FEAT represents data features that define hidden states for TAS, TOS, ES, and TIS Intra-State HMMs in the UTFM; DESN FEAT represents data features that define hidden states for TAD, TOD, ED, and TID Intra-State HMMs, while OUT FEAT is representative of data features that define hidden states for TAO, TOO, EO, and TIO states in the UTFM. Green and red plates in Fig. 9 are representative of uncertainty from determinate and indeterminate aleatoric features for disruption management respectively, which define observations (inputs) for all Intra-State HMMs in the UTFM.

Figure 9: Phases of disruption management with respect to schedule execution

4.2.2. Data Segmentation for Learning
We employ the two separate lots of data in the full data set, defined as the non-disrupted and disrupted data sets, to learn optimal parameters of all HMMs that define different phases of activity for disruption management in the UTFM. The non-disrupted data set contains six hundred and twenty thousand instances of flight schedules in the airline network that executed as originally planned between September 2016 and September 2017. As such, the non-disrupted data set contains appropriate latent (hidden) and observation data features for flight schedules that executed without any uncertainty from indeterminate aleatoric features (i.e. random disruption features). Thus, we use the non-disrupted data set to calibrate Intra-State HMMs that define the tactical and strategic (i.e. Schedule and Outcome) phases of activity for disruption management in the UTFM. Unlike the non-disrupted data set, the disrupted data set contains all instances of flight schedules that executed through irregular operations due to delays in the airline route network from September 2016 to September 2017. Hence, the disrupted data set comprises instances of flight schedules that executed with uncertainty from indeterminate aleatoric features over a one-year period for separate functional roles in the AOCC.

Therefore, we conduct Intra-State HMM learning for operational disruption management (i.e. Decision activity phases in the UTFM) by utilizing the disrupted data set. Similarly, we also utilize the disrupted data set to learn the optimal parameters of all Inter-State HMMs along the operation sequence and schedule-change sequence axes in the UTFM for separate functional roles in the AOCC. To demonstrate the application of the UTFM in this paper, we only consider disruptions due to weather-related events, because irregular operations due to weather disruptions affect all problem dimensions during airline disruption management. As such, the non-disrupted data set is used to calibrate the Intra-State HMMs for tactical and strategic disruption management; a disrupted data set, with over twelve thousand instances of delayed flight schedules due to weather-related disruptions, is used to calibrate the Intra-State HMMs for operational disruption management and all Inter-State HMMs. All disrupted and non-disrupted data sets used for training and validation are instantiated and segmented by using a random seed of 42 to ensure reproducible models.

Intra-State and Inter-State HMM learning for the development of the UTFM is implemented first by fitting data feature samples for hidden states to standard normal probability distributions that define the components of the initial measure of the UTFM. Next, samples (sets) of observed data features are grouped as observations and the initial HMM state transition parameters are set as uniform distributions based upon the total number of hidden states, before invoking the Baum-Welch algorithm set to a convergence criterion of 1 e − . We perform a 5-fold cross validation (Kohavi, 1995) of Baum-Welch training on the sets of observations by examining marginal probability distributions of latent states across different folds to ensure modeling uniformity and generalizability, for approbation of a candidate optimal Intra-State or Inter-State HMM trained on the complete set of observations. The cross validation technique is used to assess the performance (goodness) of a trained HMM (for the UTFM) for estimating the likelihood of new observation (input) data, by verifying that the sums of the log likelihood of an appropriate test set of observations across each of the five folds and corresponding state probability distributions are consistent (Ogunsina et al., 2019).
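The training setup described above (a fixed random seed of 42, uniform initial transition parameters, and 5-fold cross validation) can be sketched with NumPy alone. The helper names below are hypothetical, and the consistency check is only one possible reading of "consistent": it compares the spread of held-out log-likelihoods across folds against an arbitrary relative tolerance that is an assumption of this sketch, not a value from the paper.

```python
import numpy as np

def uniform_transitions(n_states):
    """Initial transition matrix with every row uniform over the hidden states."""
    return np.full((n_states, n_states), 1.0 / n_states)

def kfold_indices(n_samples, k=5, seed=42):
    """Yield (train, test) index arrays for k shuffled, disjoint folds."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(n_samples), k)
    for i in range(k):
        train = np.concatenate([f for j, f in enumerate(folds) if j != i])
        yield train, folds[i]

def folds_consistent(fold_logliks, rel_tol=0.05):
    """Assumed check: spread of fold log-likelihoods is small relative to their mean."""
    ll = np.asarray(fold_logliks, dtype=float)
    return bool((ll.max() - ll.min()) <= rel_tol * abs(ll.mean()))

# Usage: each fold would train an HMM on `train` and score it on `test`.
for train, test in kfold_indices(n_samples=20, k=5, seed=42):
    print(len(train), len(test))
```

Fixing the seed makes the fold assignment repeatable, so a model trained on a given fold can be regenerated exactly.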
Upon utilizing refined training data to learn the optimal parameters for Intra-State and Inter-State HMMs, a hidden Markov model (i.e. probabilistic finite state machine) representation of the UTFM is assembled to enable the decoding of new (unseen) data that represent disrupted flight schedules, by setting the weighted estimates of Viterbi probabilities estimated from all Intra-State and Inter-State HMMs as parameters of the aggregate left-right HMM that represents the UTFM, before applying the Viterbi algorithm to decode (predict) the most likely sequence of state components (i.e. phases of activity during airline disruption management) in the UTFM due to observed inputs from a specific disrupted flight schedule.
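One simple way to turn a table of non-negative transition weights into parameters that satisfy the properties of a stochastic matrix is to normalize each row to sum to one. The sketch below illustrates that idea only: the paper does not state how its weighted estimates are computed, so the row normalization, the helper names, and the example weights are all assumptions. The second helper checks the left-right property, i.e. that no transition returns to an earlier state under the chosen state ordering.

```python
import numpy as np

def to_stochastic(weights):
    """Row-normalize non-negative weights so that every row sums to 1."""
    W = np.asarray(weights, dtype=float)
    if np.any(W < 0):
        raise ValueError("weights must be non-negative")
    row_sums = W.sum(axis=1, keepdims=True)
    if np.any(row_sums == 0):
        raise ValueError("every state needs at least one outgoing weight")
    return W / row_sums

def is_left_right(P):
    """True if all probability mass lies on or above the diagonal."""
    P = np.asarray(P, dtype=float)
    return bool(np.allclose(P, np.triu(P)))

# Made-up weights for a three-state chain; the last state only loops on itself.
weights = [[0.2, 0.6, 0.2],
           [0.0, 0.5, 1.5],
           [0.0, 0.0, 0.3]]
P = to_stochastic(weights)
print(P, is_left_right(P))
```

A final state whose only outgoing weight is its self-loop normalizes to a self-transition probability of one, which is how an end state behaves in a left-right model.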
Fig. 10 shows an optimal state transition graph for hidden state features from a trained Intra-State HMM for remaining in the Turnaround Decision (TAD) phase of activity in the UTFM. Based upon the graph shown in Fig. 10, a specialist agent will commence decision-making for the turnaround phase of activity in the UTFM for operational disruption management first by assessing how much time there is until the scheduled aircraft pushback time, before considering (transitioning) to adjust the aircraft turnaround time (i.e. start probability of 1 and transition probability of 1). In the less likely event that the specialist agent does not return to assessing the time remaining prior to the scheduled aircraft pushback, a consideration to swap the aircraft is most likely (transition probability of 0.18) and there is a 77% likelihood that the process to swap the aircraft type will continue throughout the turnaround phase of flight operation during operational disruption management. Fig. 10 reveals that there is almost no prerogative for the specialist agent to consider delaying aircraft pushback time after swapping aircraft during the turnaround phase of flight operation for operational disruption management, as evidenced by the negligible transition probability of 1%.

Figure 10: State transition graph of optimal Intra-State HMM for remaining in turnaround decision (TAD)

Figure 11: State transition graph of optimal Inter-State HMM for transition from turnaround decision (TAD) to turnaround outcome (TAO)
Fig. 11 shows an optimal state transition graph for hidden state features from a trained Inter-State HMM for transitioning from the Turnaround Decision (TAD) phase of activity to the Turnaround Outcome (TAO) phase of activity in the UTFM. From the graph shown in Fig. 11, a specialist agent will most likely commence the transition from operational decision-making for the turnaround phase of activity to strategic (proactive) decision-making for a future turnaround phase of activity in the UTFM for disruption management, first by assessing flight swap (start probability of 0.92 and internal state probability of 0.38), before a most likely transition to consider adjusting aircraft turnaround time (transition probability of 0.47 and internal state probability of 0.07). In the much less likely event that the specialist agent commences the transition to strategic disruption management by considering delay time before pushback first (start probability of 0.08 and internal state probability of 0.07), there is a 57% likelihood that the decision to adjust the turnaround time will follow, and transitioning for strategic disruption management of the turnaround phase of future flight operation concludes by assessing the work shift (time available) for the next aircraft pushback schedule (end probability of 0.91 and internal state probability of 0.09).

Unlike the ergodic structure of the optimal state transition graph for the TAD Intra-State HMM represented in Fig. 10, the optimal state transition graph for the Inter-State HMM for transitioning between TAD and TAO phases of activity in the UTFM (depicted in Fig. 11) is modeled as a non-ergodic structure by introducing an absorption state (i.e. 'end' state) to characterize a definite transition process between both phases of activity. Thus, we apply ergodic (and non-ergodic) properties to determine the optimal parameters of all Intra-State and Inter-State HMMs that constitute different phases of activity in the UTFM.
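The distinction between the ergodic and non-ergodic structures can be made concrete with a small check on a transition matrix. The sketch below is illustrative only: it tests for absorbing states (a self-transition probability of one) and for irreducibility (every state reachable from every other), which is a necessary condition for ergodicity but does not by itself establish aperiodicity. The function names and example matrices are assumptions, not values from Figs. 10 and 11.

```python
import numpy as np

def absorbing_states(P, tol=1e-12):
    """Indices of states that, once entered, are never left."""
    P = np.asarray(P, dtype=float)
    return [i for i in range(len(P)) if abs(P[i, i] - 1.0) <= tol]

def is_irreducible(P):
    """True if every state can reach every other state in some number of steps."""
    P = np.asarray(P, dtype=float)
    n = len(P)
    reach = (P > 0) | np.eye(n, dtype=bool)
    for _ in range(n):  # repeated squaring of the boolean reachability matrix
        reach = (reach.astype(int) @ reach.astype(int)) > 0
    return bool(reach.all())

# Made-up examples: a fully connected chain and a chain with an 'end' state.
ergodic_like = [[0.5, 0.5], [0.2, 0.8]]
with_end_state = [[0.5, 0.5, 0.0], [0.0, 0.5, 0.5], [0.0, 0.0, 1.0]]
print(absorbing_states(ergodic_like), is_irreducible(ergodic_like))
print(absorbing_states(with_end_state), is_irreducible(with_end_state))
```

A chain with an absorbing 'end' state cannot be irreducible, since no other state is reachable from the end state.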
5. UTFM Results
We now evaluate two distinct flight schedules, impacted by two different kinds of weather-related disruptions (i.e. uncertainty from indeterminate aleatoric features), which represent two separate samples from the disrupted test (unseen) data set, by employing the UTFM for airline disruption management. We selected these flight schedules as candidate test subjects for our demonstration because they represent major routes in the network of the U.S. airline carrier that provided the data which enabled the development of the UTFM. For our assessments, we implement an aggregate non-ergodic HMM representation of the UTFM, such that the disruption management process strictly starts at the Turnaround Schedule (TAS) phase of activity and ends at the Taxi-In Outcome (TIO) phase of activity.

Figure 12: Probabilistic graphical map for UTFM assessment of a specific disrupted DAL-HOU flight

Fig. 12 shows the probabilistic graphical model representation of the UTFM for disruption management on the operation of a specific flight from Dallas to Houston (DAL-HOU), which was disrupted by air traffic control (ATC) hold for bad weather at Dallas (i.e. HD06 delay code). Fig. 12 reveals that there is a 100% likelihood that a specialist agent transitions from tactical disruption management measures to reactive disruption management measures during the turnaround phase of flight operation at Dallas (100% transition probability from TAS to TAD). As such, to effectively resolve the same disruption instance in the future, the most likely approach is to adjust or update features in the turnaround, taxi-out and enroute phases of flight operation accordingly, as evidenced by internal state probabilities of 16%, 6%, and 3% for remaining in the TAO, TOO, and EO phases of activity respectively. Furthermore, Fig. 12 reveals that the tactical disruption management initiative implemented for the turnaround flight phase to address the ATC hold for inclement weather at Dallas for that particular Dallas to Houston flight was ineffective, as evidenced by the lack of transition from the turnaround phase of flight operation to the taxi-out phase of operation (i.e. zero probability of transition from TAS to TOS). As such, delays were most likely incurred during the turnaround phase of operation while executing that particular flight from Dallas to Houston. However, tactical initiatives proved somewhat effective during the taxi-out, enroute, and taxi-in phases of activity for disruption management of the Dallas to Houston flight, affirmed by internal state probabilities (i.e. interaction of hidden data features in Intra-State HMMs) of 4%, 3%, and 10% for remaining in the TOS, ES, and TIS phases of activity respectively.

Figure 13: Probabilistic graphical map for UTFM assessment of a specific disrupted MDW-BOS flight

Fig. 13 shows the probabilistic graphical model representation of the UTFM for disruption management on the operation of a specific flight from Chicago to Boston (MDW-BOS), which was disrupted by ATC hold for bad weather en route to or at Boston (i.e. HD07 delay code). Fig. 13 affirms that it is more likely that the tactical disruption management measures a specialist agent employs for disruption management of bad weather at Boston are proactively effective for the turnaround and taxi-out phases of flight operation, as indicated by internal state probabilities of 0.16 and 0.57 for TAS and TOS respectively and zero likelihood of transitions from those states to TAD and TOD respectively. Even though the tactical disruption management measures for addressing the inclement weather disruption at Boston in the enroute and taxi-in phases of activity are somewhat effective, there may be situations where decision-making for reactive disruption management at the enroute and taxi-in phases of activity during schedule execution may prove useful, as evidenced by the state transition probabilities of 0.16 and 0.59 from ES to ED and TIS to TID respectively. Furthermore, Fig. 13 reveals that the proactive tactical disruption management measures for the turnaround and taxi-out phases of operation, implemented prior to departure from Chicago, were optimally effective for resolving ATC delay at Boston, as there are no transitions from TAS to TAD and TOS to TOD phases of activity in the UTFM. As such, delays during the flight were accrued at the enroute and taxi-in phases of operation during disruption management. However, the UTFM representation from Fig. 13 reveals that strategic disruption management initiatives to improve the future disruption resolution for this particular flight from Chicago to Boston, due to uncontrollable aleatoric uncertainty from inclement weather at Boston, do exist for turnaround, taxi-out and enroute phases of flight operation, as indicated by internal state probabilities of 17%, 60%, and 64% for remaining in the TAO, TOO, and EO phases of activity respectively.
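The way Figs. 12 and 13 are read above can be illustrated with a short sketch. The transition probabilities below are the schedule-to-decision values quoted in the text; the dictionary layout, the function name `phases_needing_reactive` and the rule "any nonzero schedule-to-decision probability marks a phase where reactive measures were engaged" are assumptions made for this illustration, not part of the UTFM itself.

```python
# Schedule-to-decision transition probabilities quoted in the text.
dal_hou = {("TAS", "TAD"): 1.0}
mdw_bos = {
    ("TAS", "TAD"): 0.0,
    ("TOS", "TOD"): 0.0,
    ("ES", "ED"): 0.16,
    ("TIS", "TID"): 0.59,
}


def phases_needing_reactive(transitions, threshold=0.0):
    """List schedule phases whose transition to the decision phase exceeds `threshold`.

    Assumption for this sketch: such a transition means the tactical (schedule)
    measures were not sufficient on their own for that phase.
    """
    return [src for (src, _dst), prob in transitions.items() if prob > threshold]


print(phases_needing_reactive(dal_hou))
print(phases_needing_reactive(mdw_bos))
```

Under this reading, the DAL-HOU flight is flagged at the turnaround phase, while the MDW-BOS flight is flagged only at the enroute and taxi-in phases, in line with where the text says the delays accrued.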
6. Conclusion
Existing practices for airline disruption management are defined by human-centric methods that do not definitively examine uncertainty in long-term (proactive) and short-term (reactive) scheduling initiatives for mitigating irregular operations during schedule execution. To this end, we introduced and demonstrated a data-driven and modular activity recognition framework that utilizes a unique class of probabilistic graphical models (i.e. the hidden Markov model) to learn and assess pertinent patterns and behaviors for proactive (tactical) disruption management prior to schedule execution, reactive (operational) disruption management during schedule execution, and proactive (strategic) disruption management after schedule execution, all of which are necessary for achieving robust airline disruption management. Two dynamic programming algorithms, the Baum-Welch and Viterbi algorithms, were used to respectively learn and decode the parameters of the different HMMs that constitute an overarching HMM, enabling the assessment of two real-world flight schedules from a major U.S. airline network that were disrupted by different weather-related delays during schedule execution.

The results from the two weather-disrupted flight schedules assessed in this paper reveal that disruption resolution measures enforced during phases of flight where the aircraft is on the ground (e.g. turnaround and taxi-in) are paramount to attaining robust airline disruption management. Decision-making initiatives employed at phases of flight where the aircraft is on the ground are very likely to propagate to the airborne phases of flight operation, consequently shaping the disruption management outlook for a particular disrupted flight. Furthermore, our relational dynamic Bayesian network (RDBN) architecture, for the assessment of uncertainty transfer between different phases of flight operation and schedule evolution, proved useful in rationalizing complex interactions of separate drivers for proactive and reactive disruption management at different phases of activity during the airline scheduling process. For air traffic control hold arising from inclement weather at the departure airport, the RDBN (illustrated by Fig. 12) revealed a severed transition between the turnaround and taxi-out phases of flight during tactical disruption management. Thus, prior to schedule execution, the likelihood of effectively completing a scheduled flight, given weather-related disruptions at the departure airport during schedule execution, is sensitive to foresighted disruption management initiatives enacted for turnaround and taxi-out phases of flight. For air traffic control hold originating from inclement weather at the arrival airport, the RDBN (illustrated by Fig. 13) revealed a complete transition process between all respective phases of flight during tactical disruption management. Hence, given weather-related disruptions at the arrival airport during schedule execution, the likelihood of practically completing a scheduled flight is unlikely to be affected prior to schedule execution.
7. Future Work
Although the work presented in this paper introduces a novel data-driven concept and its application for uncertainty quantification and propagation in the airline scheduling process for robust disruption management, there exist a few areas for further research. First, the data used to inform the development of the uncertainty transfer function model (UTFM), based upon our RDBN architecture, was provided by an airline that primarily operates a point-to-point route network structure. As such, there is a need to investigate an equivalent framework developed from the data of a major airline that utilizes a hub-and-spoke route network. Moreover, to facilitate system-wide disruption management measures like the FAA collaborative decision making initiative, readily accessible data from other air transportation system stakeholders (such as airports) can be incorporated to improve the efficacy of the RDBN architecture (UTFM) for disruption management.

Second, the selection of specific data features for different phases of activity in the construction of the UTFM introduced in this paper is primarily informed by literature and expert inputs of human specialists from one airline, and may contain biases with respect to separate perspectives for different objectives of air transportation stakeholders for system-wide disruption management. As such, proven non-parametric and unsupervised machine learning techniques can be employed to detect and mitigate such biases, ensuring a fairly objective selection of features to represent different air transportation system stakeholders for robust disruption management in the national airspace system. Furthermore, the Baum-Welch algorithm presents an inherently suboptimal unsupervised learning routine for obtaining component HMMs of the UTFM, since it is only guaranteed to converge to a local optimum of the likelihood. To that effect, more research to ensure and enhance solution fidelity of unsupervised machine learning methods is most opportune.
Acknowledgement
The authors would like to thank Blair Reeves, Chien Yu Chen, Kevin Wiecek, Jeff Agold, Dave Harrington, Rick Dalton, and Phil Beck, at Southwest Airlines Network Operations Control (SWA-NOC), for their expert inputs in abstracting the data used for this work.

Conflict of Interest
All authors have no conflict of interest to report.
Appendix A. Nomenclature for determinate aleatoric data features

| Aleatoric Data Feature | Description | Observation Input Category |
|---|---|---|
| dow | Day of the week | FREQ |
| doy | Day of the year | FREQ |
| dest x dir | Destination airport location in spherical X coordinate | DEST |
| dest y dir | Destination airport location in spherical Y coordinate | DEST |
| dest z dir | Destination airport location in spherical Z coordinate | DEST |
| moy | Month of the year | FREQ |
| ONBD CT | Total number of passengers onboard flight | PAX DMD |
| orig x dir | Origin airport location in spherical X coordinate | ORIG |
| orig y dir | Origin airport location in spherical Y coordinate | ORIG |
| orig z dir | Origin airport location in spherical Z coordinate | ORIG |
| route | Spherical distance between origin and destination airports | RTE |
| sched route originator flag | Flag to indicate first flight of the day | ORIG |
| season | Season of the year | FREQ |

Appendix B. Nomenclature for indeterminate aleatoric features

| Aleatoric Data Feature | Description | Observation Input Category | Functional Role |
|---|---|---|---|
| HD03 | Weather holding | DISRP | Weather |
| HD06 | ATC gate hold for weather at departure station | DISRP | Weather |
| HD07 | ATC gate hold for weather at enroute or at destination station | DISRP | Weather |
| HD08 | Ice on wings / cold-soaked fuel | DISRP | Weather |
| HD09 | Deicing at gate | DISRP | Weather |
| MX05 | Inspection due to lightning strike | DISRP | Weather |
| MX07 | Inspection due to turbulence | DISRP | Weather |
| MX08 | Hail, ice, or snow damage | DISRP | Weather |

Appendix C. Nomenclature for Epistemic Data Features

| Epistemic Data Feature | Description | Activity Phase in UTFM |
|---|---|---|
| ACTL ACFT TYPE | Actual aircraft type used | TAO |
| actl block mins | Actual blocktime period | TOO, EO, TIO |
| actl enroute mins | Actual flight period in the air | EO |
| ACTL TURN MINS | Actual turnaround period | TAO |
| ADJST TURN MINS | Adjusted turnaround period | TAD |
| DELY MINS | Total delay period before actual pushback | TAD, TOD |
| DOT DELAY MINS | Total arrival delay | ED, TID |
| late out vs sched mins | Total departure delay | TOD |
| SCHED ACFT TYPE | Scheduled aircraft type used | TAS |
| sched block mins | Scheduled blocktime period | TOS, ES, TIS |
| SCHED TURN MINS | Scheduled turnaround period | TAS |
| shiftper actl GP | % work shift completed at actual gate parking time | TID |
| shiftper actl LD | % work shift completed at actual landing time | ED |
| shiftper actl PB | % work shift completed at actual pushback time | TOD |
| shiftper actl TO | % work shift completed at actual takeoff time | ED |
| shiftper sched GP | % work shift completed at scheduled gate parking time | TID |
| shiftper sched PB | % work shift completed at scheduled pushback time | TAD |
| SWAP FLT FLAG | Flight swap flag | TAS, TAD, TAO |
| taxi in | Taxi-in period | TIS, TIO |
| taxi out | Taxi-out period | TOS, TOO |
| tod actl GP | Actual aircraft gate parking time at destination | TIO |
| tod actl LD | Actual aircraft landing time at destination | EO |
| tod actl PB | Actual aircraft pushback time at origin | TAO |
| tod actl TO | Actual aircraft takeoff time at origin | TOO |
| tod sched GP | Scheduled aircraft gate parking time at destination | TIS |
| tod sched PB | Scheduled aircraft pushback time at origin | TAS |

Appendix D. Dynamic Programming Algorithm 1

Algorithm 1
Baum-Welch Algorithm

procedure BaumWelch(Y, X)
    A, B, α, β ∈ Y
    for t = 1 : N do
        γ(:, t) = (α(:, t) ⊙ β(:, t)) / Σ(α(:, t) ⊙ β(:, t))
        ξ(:, :, t) = ((α(:, t) ⊙ A(t + 1)) ∗ (β(:, t + 1) ⊙ B(X_{t+1}))^T) / Σ(α(:, t) ⊙ β(:, t))
    end for    ▷ where N = |X|
    π̂ = γ(:, 1) / Σ(γ(:, 1))
    for j = 1 : K do
        Â(j, :) = Σ(ξ(2 : N, j, :), 1) / Σ(Σ(ξ(2 : N, j, :), 1), 2)
        B̂(j, :) = (X(:, j)^T γ) / Σ(γ, 1)
    end for    ▷ where K is number of states
    return π̂, Â, B̂
end procedure

Appendix E. Dynamic Programming Algorithm 2

Algorithm 2 Viterbi Algorithm

procedure Viterbi(Y, X)
    A, B, π ∈ Y
    Initialize: δ_1 = π ∘ B_{X_1}, a_1 = 0
    for t = 2 : N do
        for j = 1 : K do
            [a_t(j), δ_t(j)] = max_i (log δ_{t−1}(:) + log A_{ij} + log B_{X_i}(j))
        end for    ▷ where K is number of states
    end for    ▷ where N = |X|
    Z*_N = arg max δ_N
    for t = N − 1 : 1 do
        Z*_t = a_{t+1}(Z*_{t+1})
    end for
    return Z*_{1:N}
end procedure

Appendix F. Dynamic Programming Algorithm 3

Algorithm 3 UTFM Learning Algorithm

procedure UTFMlearning(X, Y)
    X_S = {s_1, ..., s_m}, X_D = {d_1, ..., d_m}, X_O = {o_1, ..., o_m}    ▷ Disrupted flight data
    for all j ∈ (1, 2, ..., m) do
        S′ ← S_j, D′ ← D_j, O′ ← O_j, A′ ← α_ij : S_i → S_j, B′ ← β_ij : D_i → D_j, Γ′ ← γ_ij : O_i → O_j, K′ ← κ_j : S_j → D_j, Λ′ ← λ_j : D_j → O_j    ▷ for i = j − 1, i > 0
        M′ ← {S′, D′, O′, A′, B′, Γ′, K′, Λ′}    ▷ Initialize Optimal HMM sets for UTFM
        while |X_S|, |X_D|, |X_O| > m or ¬M′ do
            Y_S = {y_s1, ..., y_sl}, Y_D = {y_d1, ..., y_dl}, Y_O = {y_o1, ..., y_ol}    ▷ Training (flight schedule) data for l ≥ m
            S′_j ← BaumWelch(S_j, Y_S), D′_j ← BaumWelch(D_j, Y_D), O′_j ← BaumWelch(O_j, Y_O), α′_ij ← BaumWelch(α_ij, Y_S), β′_ij ← BaumWelch(β_ij, Y_D), γ′_ij ← BaumWelch(γ_ij, Y_O), κ′_j ← BaumWelch(κ_j, Y_D), λ′_j ← BaumWelch(λ_j, Y_O)
            S′ ← S′_j, D′ ← D′_j, O′ ← O′_j, A′ ← α′_ij, B′ ← β′_ij, Γ′ ← γ′_ij, K′ ← κ′_j, Λ′ ← λ′_j    ▷ Update Optimal HMM sets for UTFM
        end while
    end for
    N′ ← {S′, D′, O′}, N_{i→j} ← {A′, B′, Γ′}, N_{i→i} ← {K′, Λ′}
    N_→ ← {N_{i→i}, N_{i→j}}
    K ← (N′, N_→)    ▷ Optimal (RDBN) Data Architecture for UTFM
end procedure

Appendix G. Dynamic Programming Algorithm 4

Algorithm 4 UTFM Decoding Algorithm
Require: K    ▷ Optimal UTFM Architecture
procedure UTFMdecoding(X)
    P(s) ← Viterbi(S′, X_S), P(d) ← Viterbi(D′, X_D), P(o) ← Viterbi(O′, X_O), P(α) ← Viterbi(A′, X_S), P(β) ← Viterbi(B′, X_D), P(γ) ← Viterbi(Γ′, X_O), P(κ) ← Viterbi(K′, X_D), P(λ) ← Viterbi(Λ′, X_O)    ▷ Unroll K with disrupted flight information X
    for all j ∈ (1, 2, ..., m) do
        φ_j ← P(s_j) + P(α_ij) + P(κ_j), ψ_j ← P(d_j) + P(β_ij) + P(λ_j), ρ_j ← P(o_j) + P(γ_ij)    ▷ for i = j − 1, i > 0
        a ← P(s_j) / φ_j, b ← P(α_ij) / φ_j, c ← P(κ_j) / φ_j, p ← P(d_j) / ψ_j, q ← P(β_ij) / ψ_j, r ← P(λ_j) / ψ_j, u ← P(o_j) / ρ_j, v ← P(γ_ij) / ρ_j    ▷ Stochastic matrix (state probabilities) for UTFM
    end for
    N ← {a, p, u}, N_{i→j} ← {b, q, v}, N_{i→i} ← {c, r}
    N_→ ← {N_{i→i}, N_{i→j}}
    K ← (N, N_→)    ▷ UTFM for disrupted flight
    return K
end procedure
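
The normalization step inside the loop of Algorithm 4 can be sketched in a few lines. This is an illustrative sketch under stated assumptions, not the authors' implementation: the function name `normalize_scores`, the dictionary keys and the input values are hypothetical, and the inputs stand in for the decoded quantities P(s_j), P(α_ij) and P(κ_j) of one schedule phase j.

```python
def normalize_scores(scores):
    """Divide each score by the sum of all scores so that the results add up to 1.

    Mirrors a = P(s_j) / phi_j, b = P(alpha_ij) / phi_j, c = P(kappa_j) / phi_j,
    where phi_j is the sum of the three decoded quantities.
    """
    total = sum(scores.values())
    if total <= 0:
        raise ValueError("scores must have a positive sum")
    return {name: value / total for name, value in scores.items()}


# Hypothetical decoded quantities for one schedule phase j (not from the paper).
schedule_phase = {
    "internal": 2.0,       # stands in for P(s_j): remaining in the schedule phase
    "to_next_phase": 3.0,  # stands in for P(alpha_ij): moving between schedule phases
    "to_decision": 5.0,    # stands in for P(kappa_j): moving from schedule to decision
}

row = normalize_scores(schedule_phase)
print(row)
```

The same helper would apply to the decision-phase triple (p, q, r) and the outcome-phase pair (u, v), each normalized by its own sum, so that every phase contributes one row of a stochastic matrix.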