Addressless: A New Internet Server Model to Prevent Network Scanning
Shanshan Hao, Renjie Liu, Zhe Weng, Deliang Chang, Congxiao Bao, Xing Li
Institute for Network Sciences and Cyberspace, Tsinghua University, Beijing, China
Department of Electronic Engineering, Tsinghua University, Beijing, China
* [email protected]
Abstract
Eliminating unnecessary exposure is a principle of server security. The huge IPv6 address space enhances security by making scanning infeasible; however, with recent advances in IPv6 scanning technologies, network scanning is again threatening server security. In this paper, we propose a new model named addressless server, which separates the server into an entrance module and a main service module, and assigns an IPv6 prefix instead of an IPv6 address to the main service module. The entrance module generates a legitimate IPv6 address under this prefix by encrypting the client address, so that the client can access the main server on a destination address that is different in each connection. In this way, the model provides isolation to the main server, prevents network scanning, and minimizes exposure. Moreover, it provides a novel framework that supports flexible load balancing, high availability, and other desirable features. The model is simple and does not require any modification to the client or the network. We implement a prototype, and experiments show that our model can prevent the main server from being scanned at a slight performance cost.
Introduction
Exhaustion of IPv4 addresses has long been recognized and is now a reality. IPv6 [1] was proposed in 1995 to solve this problem. The main improvement is the 128-bit address over the 32-bit IPv4 address, together with other goals like the end-to-end feature and better security. As of March 2020, about 25% of information resources (websites, emails, etc.), 60% of DNS servers, and 30% of Internet clients support IPv6. With the rise of the Internet of Things, 5G, and cloud computing, it is predicted that more than 75 billion devices will be connected to the Internet by 2025, while there are only 4 billion IPv4 addresses in total. An inevitable and faster adoption of IPv6 can be expected. IT giants such as Facebook and Microsoft have been moving to an IPv6-only internal network (https://teamarin.net/2019/04/03/microsoft-works-toward-ipv6-only-single-stack-network/). An IAB (Internet Architecture Board) statement expected that the IETF would stop requiring IPv4 compatibility in new or extended protocols, and that future work would optimize for and depend on IPv6. Pure IPv6 is the future of the Internet.

IPv6 is developed with security in mind, realizing that security, as well as privacy, has always been one of the biggest threats to the Internet. It is natural to start by utilizing the massive address space of IPv6 to enhance security and privacy, considering that this is its biggest difference from IPv4. By making network scanning difficult, this massive address space of IPv6 has naturally provided preliminary protection. The IP address is the identifier and locator of the Internet; thus network scanning is usually the first phase of an attack, used to obtain the IP addresses of potential victims. This is easy in IPv4: the scanning time of the entire IPv4 address space is only about 45 minutes [2]. However, it takes more than 100 quintillion years to scan the entire IPv6 address space with the same efficiency.
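As a sanity check on these figures, a minimal back-of-the-envelope calculation (assuming the same probing rate as the 45-minute full IPv4 scan reported for ZMap [2]; the exact rate is an illustrative assumption) shows that a brute-force IPv6 scan is far beyond any feasible time scale:

```python
# Back-of-the-envelope comparison of IPv4 vs. IPv6 full-scan time,
# assuming the ~45-minute IPv4 scan rate of ZMap [2].
ipv4_space = 2 ** 32
ipv6_space = 2 ** 128
scan_seconds_ipv4 = 45 * 60             # ~45 minutes for all of IPv4

rate = ipv4_space / scan_seconds_ipv4   # addresses probed per second
ipv6_seconds = ipv6_space / rate
ipv6_years = ipv6_seconds / (365 * 24 * 3600)

print(f"probe rate: {rate:.2e} addr/s")
print(f"IPv6 full scan: {ipv6_years:.2e} years")
```

At roughly 1.6 million probes per second, the result is on the order of 10^24 years, consistent with the "more than 100 quintillion years" figure above.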
The massive address space of IPv6 helps a lot to hide the address of a device. However, on one hand, NAT in the IPv4 era, albeit not designed for security, brings the byproduct of hiding devices and the topology of the network from the outside. NAT is stateful and violates the end-to-end principle, and was thus deprecated by IPv6 designers. Many network administrators, however, are so used to the invisibility provided by NAT that they feel NAT-less IPv6 rather poses a threat of exposure, despite the difficulty of scanning IPv6's huge address space.

On the other hand, recent advances in IPv6 scanning technology have threatened the preliminary invisibility provided by the huge address space. Various approaches have been proposed to scan the IPv6 Internet more efficiently, mainly in two ways: collecting active IPv6 address records [3–10], and using statistical and machine learning methods to generate hitlists [11–16]. Most of these approaches are server-specific. Scanning IPv6 clients remains difficult, which brings many security benefits for the clients. We cannot help but ask: can we further utilize the IPv6 address space to make IPv6 servers unscannable as well, avoiding as much unnecessary exposure as possible, thereby allowing servers to enjoy the security advantages brought by invisibility, without violating end-to-end reachability?
There have been attempts to enhance the anti-scanning feature of IP addresses, mainly by: 1) generating addresses that are semantically opaque and more random [17], 2) using temporary [18, 19] or hopping [20–25] addresses with short lifetimes, and further 3) using a unique address for each connection [26–30] or even each packet [31, 32]. The former two approaches are basically incremental improvements: each device still has one IP at a time, which can be scanned. The per-connection address is an interesting idea, but previous models are very complex, unscalable, and hard to deploy. Specifically, they largely remain within the framework of dynamically shared address pools. Address mapping and address collision become problematic, the network needs to be modified to enable routing, and the system is complicated. Moreover, few works are about servers, and those generally require synchronous cooperation of the client side.

On the other hand, the current network security model relies on encryption. For instance, SEND [33], TLS [34], and DNSSEC [35] are used at the data link layer, the transport layer, and the application layer, respectively. At the network layer, IPsec [36] encrypts the payload of IP packets, but the (outer) IP addresses are left exposed. Since TCP/IP was introduced, the creators of the Internet have considered introducing encryption into the IP address itself. This was an extravagant dream in the era of IPv4, since addresses are scarce and reused with great care. However, this has changed in IPv6, and the 128-bit address provides sufficient space to carry the encrypted information. Previous work [30–32] proposes to encrypt the host identity into pseudo-random temporal addresses, so that the host identity is hidden from the outside network. But this has to be done on local routers, otherwise routing will fail; the network needs to be modified, and it is a sort of encapsulation.
CGA [37] applies hashing in the auto-generation of addresses, but it aims to solve link-local address spoofing, and again the addresses are exposed. The literature has not seen a simple and scalable mechanism to introduce encryption into the IP address itself.

In this paper, a model named addressless public server is proposed. We creatively use the prefix delegation mechanism [38] in a way that differs from its original intention, so that we introduce encryption into the per-connection address in a very simple way, with no modification of other participants needed. And we make it usable for public servers, making them difficult to scan. Not only is security strengthened in this way, but our model also offers a novel architecture enabling flexible high availability and other features.

The addressless server has two modules: one provides the independent entrance of the server, and the other provides the main services. When receiving access requests, the entrance module generates a destination address using encryption, and redirects the request to the generated destination address. This destination address is different for each connection request. The main service module is allocated a prefix instead of an IPv6 address and listens on all the addresses under the prefix. When the main service module receives a data flow, it conducts verification on the destination address, and only responds to flows that pass the verification. Scanning traffic and attackers that visit the main service module directly cannot pass the verification, and are therefore immediately dropped. In this way, we make the main server imperceptible.

This model separates the network entrance and provides isolation for the main server. It allocates a prefix instead of an address to the main server, so that it no longer has one address at a time, and is thus imperceptible to the outside network.
At the same time, our model naturally supports flexible high-availability solutions such as lightweight load balancing, active-active clusters, and CDN. More security and functional mechanisms can be developed based on it. All in all, our model provides a novel perspective on various problems faced by public servers.
Background and Related Work
Our model uses the massive IPv6 address space to prevent the server from being scanned. So in this section, we first introduce network scanning and IPv6 address space security. Then we introduce prefix delegation, the address configuration mechanism that our model is built upon. Finally, we make a detailed analysis of previous work on using the IPv6 address space to enhance security, and compare our model to the most related ones.
Network Scanning
Network scanning is a technique to collect active addresses by sending packets to a huge set of addresses. Early scanning tools like Nmap [39] often take days to scan the entire IPv4 address space. ZMap [2], proposed in 2013, reduces the scanning time of the entire IPv4 address space to about 45 minutes. This makes the cost of scanning under IPv4 negligible.

Due to the massive address space, network scanning in IPv6 has always been considered impossible. Although ZMapv6 [40] has been proposed to scan IPv6 addresses, it is only a scanning tool that uses the same technique as ZMap without improving scanning efficiency. However, a series of techniques has been proposed in recent years to reduce the difficulty of IPv6 scanning.

The work on IPv6 scanning can be divided into two types. One is to generate hitlists by collecting active addresses on the Internet. Various sources can be used to collect active addresses. Fiebig et al. [3] propose to generate the hitlist using DNS data, and Borgolte et al. [4] propose to generate the hitlist using DNSSEC-signed reverse zones, while rDNS data is used by Fiebig et al. [5]. These studies mainly focus on generating hitlists of Internet servers, while Beverly et al. [6] and Rohrer et al. [7] introduce approaches to collect router addresses, and Rye et al. [8] probe and collect addresses of last-hop routers using traceroutes. Gasser et al. [9] summarize these methods and give a large hitlist set, and publish a compiled, open-source, and frequently updated hitlist whose quality is enhanced using active measurements [10].

The other is to generate hitlists by predicting active addresses using statistical or machine learning algorithms. Foremski et al. [11] first propose that active IPv6 addresses can be predicted by statistical algorithms. They introduce Entropy/IP, an algorithm to generate hitlists using the Bayesian algorithm. Ullrich et al. [12] introduce a scanning algorithm based on pattern recognition, and Zuo et al. [13] analyze the active addresses and make predictions through association rule learning. Murdock et al. [14] propose 6Gen, an algorithm that uses some active addresses as seeds to generate hitlists. Liu et al. [15] introduce 6Tree, which uses hierarchical clustering to predict addresses and generate hitlists. Deep learning methods have also been introduced: Cui et al. [16] stack gated convolutional networks to encode address structure and generate hitlists.
IPv6 Address Space Security
Since the birth of IPv6, researchers have been looking for ways to increase the security and privacy of IPv6 addresses.

In RFC 4291 [41], an IPv6 address is divided into two parts: a 64-bit prefix and a 64-bit interface identifier (IID). The most widely used IPv6 address configuration methods are SLAAC [42] and DHCPv6 [38]. In SLAAC, the IID is generated from the MAC address of the device using the EUI-64 algorithm [43]. This makes the IID part of the IPv6 address remain constant over the lifetime of a device. In DHCPv6, the rules of address configuration are determined by the network administrator. In the early days, these rules were simple and regular, which made the addresses used in the network show obvious patterns. Scanning is quite easy for both SLAAC and DHCPv6 addresses, thus posing a threat to security.

To solve this problem, the SLAAC Privacy Extension (SLAAC PE) is proposed in RFC 4941 [18]. In RFC 4941, the client is recommended to use a temporary address to communicate with servers. The IID of the temporary address is one-time, generated by applying the MD5 algorithm [44] to the previous IID. In this way, the security and privacy of the client are enhanced in SLAAC. DHCPv6 also proposes an approach for allocating temporary addresses [45], but it is not widely used because it brings a series of problems [46].

More in-depth work has been introduced on this basis. Semantically opaque IIDs are recommended for use in SLAAC and DHCPv6 by RFC 7217 [17], RFC 7493 [47], and RFC 8065 [48], etc. RFC 7707 [19] recommends reducing the lifetime and increasing the randomness of IPv6 addresses used by network devices, based on a summary of scanning algorithms.

At the same time, researchers have also put forward work on the measurement of the active IPv6 address space. Plonka et al. [49] analyze the temporal and spatial characteristics of active IPv6 addresses. Li et al. [50] describe the distribution characteristics of Internet IPv6 prefixes. These works demonstrate how IPv6 addresses are actually used, and thus objectively show the security status of the IPv6 address space.

There are also a few works focused on the detection and defense of IPv6 scanning. Fukuda et al. [51] introduce an approach to detect IPv6 scanning and evaluate its severity. Plonka et al. [52] introduce kIP, a new approach to increase the anonymity of IPv6 addresses.
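To illustrate why classic SLAAC addresses are so predictable, here is a minimal sketch of the EUI-64 derivation described above: the 48-bit MAC address is split in half, ff:fe is inserted in the middle, and the universal/local bit (bit 7 of the first octet) is flipped. The example MAC address is made up for illustration.

```python
def eui64_iid(mac: str) -> str:
    """Derive the EUI-64 interface identifier from a MAC address."""
    octets = [int(b, 16) for b in mac.split(":")]
    octets[0] ^= 0x02                             # flip the universal/local bit
    eui = octets[:3] + [0xFF, 0xFE] + octets[3:]  # insert ff:fe in the middle
    # Group into four 16-bit hextets, as in an IPv6 address
    return ":".join(f"{eui[i] << 8 | eui[i+1]:04x}" for i in range(0, 8, 2))

# A device with this (made-up) MAC always gets the same IID:
print(eui64_iid("00:1a:2b:3c:4d:5e"))  # → 021a:2bff:fe3c:4d5e
```

Since the MAC vendor prefix (the first three octets) comes from a small set of known assignments, the resulting IIDs cluster heavily, which is exactly what hitlist generators exploit.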
IPv6 Prefix Delegation
In our model, the main service module is assigned an IPv6 prefix. Assigning a prefix to a device is allowed in IPv6, typically using the DHCP-PD [38] mechanism. DHCP-PD was originally proposed to allow a DHCP server to assign a prefix to a DHCP client, so that this DHCP client can further allocate the addresses under the prefix to other devices. In our model, DHCP-PD is used for a different purpose: the main service module is assigned a prefix using DHCP-PD, but after that, it uses the addresses under the prefix itself, instead of allocating them to others.

Besides DHCP-PD, prefix delegation is also used in other scenarios. RFC 8273 [53] allows each device in the same subnet to be configured with a unique IPv6 prefix, so that the devices are logically under different subnets and cannot send packets to each other except through the first-hop router. Using this, isolation is provided between the devices in a shared-access network. However, each device only uses one fixed address under the prefix. To the best of our knowledge, the literature has not made deeper use of the prefix allocated to a device.
Using IPv6 Address Space to Enhance Security and Privacy
IP address hopping.
The technique of dynamically and frequently changing the IP address of a device has long been used to prevent attackers from finding the target, since the era of IPv4. In the IPv4 era, it could only be achieved with the help of network services, such as DHCP servers [54] and SDN [24, 25]. In the IPv6 era, the massive address space and SLAAC enable auto-configuration of dynamic addresses [21]. IP address hopping for a server is harder, since its address needs to be known by clients to allow inbound traffic. In previous work, the sender has to update the addresses synchronously with the receiver based on a shared secret [20, 22, 23].
Per-connection address and beyond.
The IPv6 addressing model specifically supports assigning multiple IP addresses to a single interface [41]. Thus researchers have gone beyond address hopping to assign each connection (or a set of closely related connections) a unique address [26–30] to enhance privacy. Our model also uses per-connection addresses, and a detailed comparison is given later in this subsection.

The per-connection address is not the end of the road either. Researchers extend this idea temporally to propose the per-packet address [31, 32], and spatially to propose prefix alteration [55–57]. The per-packet address means using a unique address for each packet. However, the stronger privacy is achieved at the cost of higher complexity of demultiplexing packets into flows, and modification of local networks to enable routing [31, 32]. Prefix alteration means that not only the IID part but also the prefix part of a device's address can vary. It can be achieved by prefix hopping, prefix bouquets, prefix sharing, and variable-length prefixes [55, 56], or by exploiting Mobile IPv6 [57].
Limitation of previous work.
The idea of using the address space to hide node identity stems from IP address hopping in the IPv4 era, which could only be achieved through a pool of dynamic addresses shared by a group of devices, due to the scarcity of addresses. Although previous works have applied the idea to IPv6 and extended it to per-connection or even per-packet addresses, in essence, they have not gone beyond the dynamically shared address pool. That is, devices are still sharing a set of dynamic addresses, although this set becomes enormous; it contains all the addresses under the IPv6 prefix.

The scheme of the dynamically shared address pool brings two problems. First, mapping an address to a device or a connection, and the related routing, are difficult. Early work is stateful: the mapping needs to be recorded [26, 28, 29]. More recent works achieve stateless mapping by encrypting the host identity into the one-time pseudo-random address. However, decryption needs to be done at the local router, because the host identity needs to be extracted for local routing [30–32]. Modification of the network service makes this unlikely to be deployed.

Second, IPv6 enables self-generated pseudo-random addresses without relying on stateful DHCP servers. But making the sharing of addresses stateless brings the possibility of address collisions. Previous works skip this problem by stating that the possibility is negligible, or use duplicate address detection to detect and avoid collisions [28, 29]. Some use external unique identifiers to generate addresses, such as the Electronic Product Code [58] in IoT scenarios.

Further, few works are about servers. On one hand, it is complex for a server to maintain a huge set of addresses assigned to each of its connections under this dynamically shared address pool scheme. For instance, Sakurai et al. [29] use sliding windows to maintain active addresses. On the other hand, the contradiction between inbound traffic and dynamic addresses has not been solved. Sender and receiver need to cooperate and synchronously update a pseudo-random sequence of addresses calculated based on a secret [20, 22, 23, 29]. This is infeasible for public servers.
Comparison of our model.
We innovatively utilize the prefix delegation mechanism [38]. Our model is built on the idea of the per-connection address. However, the device is assigned all the addresses under the prefix, so the complexity of routing and address collision is completely eradicated, and modification of the network service is no longer necessary. And specifically for a server with a huge number of connections, the maintenance of the active addresses is no longer a problem.

Prefix delegation also enables us to introduce encryption into the address at the endpoint. Previously this had to be done on local routers [30–32], otherwise the dynamically shared addresses could not be routed properly. Together with a novel stateless salting algorithm, we achieve a stateless mapping of the per-connection address to the client.

And to make the public server accessible at the per-connection addresses, we innovatively separate the Internet entrance module from the main service module, and use the entrance module to generate the encrypted address and redirect the client. In this way, no modification of the client side is needed. The exposed entrance module bears no other logic and is simple, and the main service is isolated and hidden in the huge address space.
Design of Addressless Server
In this section, the design of the addressless server is presented.
Design Principles
The addressless server model is designed to prevent the server from being perceived and scanned by attackers. To reach this goal, the model takes advantage of the following features:

1. Separate the Internet entrance module from the main service module, and provide isolation to the main service module.
2. Allocate a prefix instead of an address to the main service module.
3. Eliminate the one-to-one correspondence between the server and the IP address.
4. Introduce encryption into the IPv6 address to make use of the redundant IPv6 address space.

In this way, the model can provide a triple guarantee for the server: strip the Internet entrance from the main server, hide the legitimate address in the massive IPv6 address space, and use encryption to ensure that only the legitimate address generated by the entrance module can be visited. By decoupling the server and the IP address, attackers can no longer use the IP address as the identification of the server. This is the meaning of "addressless". Through this, the server is protected from being perceived and scanned by outside devices, so that security is enhanced.
System Design
We divide the server into two modules: the entrance module and the main service module. The entrance module has a fixed Internet address; the main service module is configured with a prefix and uses all the addresses under the prefix to communicate with clients. The topology of the addressless server is shown in Fig 1.
Fig 1. Topology of the Addressless Server.
The server is separated into the entrance module and the main service module. The main service module is directly connected to the first-hop router. The dotted line indicates that the entrance module can be deployed anywhere on the Internet.

The main service module is directly connected to the first-hop router. The main service module and the first-hop router are both configured with a non-public IPv6 address. This address is used for prefix delegation and routing. It is usually a link-local address; if there are more management requirements, other private addresses such as ULA [59] can also be configured. The first-hop router routes all the packets whose destination addresses are under this prefix to the main service module. The entrance module is configured with a fixed IPv6 address. This address should be configured as an AAAA record in the DNS system.

When an Internet client initiates a connection with the server, it first sends a DNS query to the DNS server. The DNS server returns the entrance address to the client. Then the client sends the request to this address. After receiving the request, the entrance module uses the prefix of the main service module and the source address of the packet to calculate an IPv6 address through an encryption algorithm, then returns this address to the client. Finally, the client initiates connections with the main service module using this address as the destination address.

The calculation process can be formulated as the following equations. We denote the client address as SA, the encryption process as the function f() (the specific process of f() is discussed in the next subsection), and the prefix of the main service module as prefix; then the destination address is:

    DA_{1:N} = prefix          (1)
    DA_{N+1:128} = f(SA)       (2)

To achieve compatibility, clients that are agnostic of our model should be able to communicate with the public server without modification.
So we use the redirection mechanism to achieve the process of the entrance module returning the generated address and the client connecting to that address. In this case, the entrance module temporarily redirects the client's request to the generated address. After receiving this message, the client sends a new request to the main service module. When the main service module receives a flow, it verifies the destination address against the source address through the same encryption algorithm. The verification process can be described by Eq (3):

    Res = g(SA, DA_{N+1:128})  (3)

In Eq (3), g() is the verification function. Res is a boolean value: True means the flow passes the verification, while False means the flow fails it. If the verification fails, the server discards the packets. The mechanism is described in Fig 2.
Fig 2. Mechanism of the Addressless Server. ① The client requests the AAAA record; ② the DNS server returns the address of the entrance module; ③ the client visits the entrance module; ④ the entrance module generates an address and returns a redirect message pointing to that address; ⑤ the client initiates communication with the main service module.
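To make the flow above concrete, here is a hypothetical sketch of the entrance-module side, not the paper's prototype: a stand-in f() (a truncated keyed hash rather than the hash+salt+DES pipeline described later) produces the 64-bit suffix of Eq (2), the address is formed under an example /64 documentation prefix, and the client is sent an HTTP 307 redirect (step ④). The key, prefix, and function names are illustrative assumptions.

```python
import hmac, hashlib, ipaddress

KEY = b"entrance/main shared secret"                       # assumed shared key
PREFIX = ipaddress.IPv6Network("2001:db8:1234:5678::/64")  # example /64 prefix

def generate_da(source_addr: str) -> str:
    """Eqs (1)-(2): DA = prefix | f(SA). A truncated HMAC stands in for f()."""
    digest = hmac.new(KEY, source_addr.encode(), hashlib.sha256).digest()
    suffix = int.from_bytes(digest[:8], "big")       # 64-bit suffix
    return str(PREFIX.network_address + suffix)

def redirect_response(da: str, path: str = "/") -> bytes:
    """Step 4: temporarily redirect the client to the generated address."""
    return (f"HTTP/1.1 307 Temporary Redirect\r\n"
            f"Location: http://[{da}]{path}\r\n"
            f"Content-Length: 0\r\n\r\n").encode()

da = generate_da("2001:db8::aaaa")
assert ipaddress.IPv6Address(da) in PREFIX   # routed to the main service module
print(redirect_response(da).decode().splitlines()[1])   # the Location header
```

Because the suffix is a deterministic function of the source address, the main service module can later recompute it from the incoming flow's source address and compare, with no per-connection state.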
The communication process above shows how a client successfully initiates a flow to the server. Note that the client here is just a common IPv6 client: it is configured with an IPv6 address, not a prefix. The address used over the whole connection lifetime should remain unchanged; otherwise, the source address used in the encryption and the source address used in the decryption may differ, leading to the failure of the verification.

The verification process is performed on each flow. The main service module only needs to perform the verification once per flow, because the source address and destination address remain unchanged during the flow. Once the connection is established, the server can directly accept the following packets until the end of the flow. After a connection is terminated, the client should visit the address of the entrance module if it is going to initiate another connection, and the entrance module generates another address and redirects the request to this new address.

To prevent attackers from intercepting data flows and launching replay attacks, the generated address should be different each time. To reach this goal, we add a time-varying factor, described as the salt, in the encryption process f(). In this case, when an attacker intercepts the packets and forges an attack message using the source and destination addresses from those packets, it will not be effective, because the legitimate destination address has changed along with the salt.

In our model, the entrance module can be deployed anywhere on the Internet, logically and geographically. It can be configured on the same device as the main service module, and connected to the Internet through the same first-hop router. The entrance module can also be configured on a different device in the same subnet, or even at a location far from the main server on the Internet.
The address allocated to the entrance module can be one of the addresses under the prefix delegated to the main service module, or it can be a totally independent global unicast address. Furthermore, considering that the entrance module only provides simple and reproducible services, it is easy to stack entrance modules or deploy them in a distributed manner. All in all, the configuration of the entrance module is very flexible, and various deployment strategies can be used to achieve better results.

In the addressless server mechanism, the entrance module is responsible for calculating the destination address and returning a redirect message to the client. If the server needs a stricter strategy, the entrance module can also provide a user authentication service, and only reply to clients who pass the authentication. This ensures that all the clients that can perceive the main service module are authenticated, which further ensures server security without affecting simplicity.

From the above discussion, we can see that our model takes advantage of the redundant space of the IPv6 address by introducing encryption into IPv6 suffixes. The IPv6 space is very large, and only a small part is actually used in traditional scenarios. As a result, we can 'waste' the address space to prevent the server from being scanned. We assign a prefix to the server to free up the suffix space and let it carry the authentication information. In our model, the use of the prefix and the suffix of the destination address differs: the prefix is used for routing, while the suffix carries authentication information to provide additional security benefits.

Encryption Algorithm
The encryption in this paper is essentially a signature-verification process. When a client initiates communication, the entrance module first signs the source address, embeds the result into the destination address, and returns it to the client. Then the client initiates connections to that address, and the main service module conducts verification using the source address and the destination address. Since we do not need to restore the message, f() here does not need to be reversible.

The encryption process f() is described as follows:

    H_SA = Hash(SA)                      (4)
    P_SA = Φ(H_SA, salt)                 (5)
    DA_{65:128} = e(P_SA, key)           (6)
    DA = strcat(prefix, DA_{65:128})     (7)

And the verification process g() is described as follows:

    H_SA = Hash(SA)                      (8)
    P_SA = e^{-1}(DA_{65:128}, key)      (9)
    Result = Ψ(P_SA, H_SA, salt)         (10)

In Eq (4)–Eq (10), SA is the client address (the source address of the connection request); DA is the generated address under the prefix of the main service module (the destination address of the connection request).

In Eq (4) and Eq (8), Hash() is a hash function that converts the source address into a 64-bit sequence. A hash function guarantees that the sequence is uniformly distributed in the entire range space. Any hash function is feasible here, including cryptographic hash functions such as MD5 [44] and SHA-1 [60], or string hash functions such as DJB and BKDR. A string hash function is a better choice here because of its higher efficiency; the DJB algorithm is used in our prototype.

In Eq (5) and Eq (10), salt is used as the time-varying factor. We add the salt in the function Φ(), which is discussed in the next subsection. The function Ψ() is the verification function determined by Φ().

In Eq (6) and Eq (9), e() is the encryption function, e^{-1}() is the decryption function, and key is the encryption key. Fig 3 briefly shows the encryption and verification process.

Fig 3. Encryption Process and Verification Process.
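A minimal end-to-end sketch of f() and g() (Eqs (4)–(10)) is given below. It is illustrative, not the prototype: the hash is a generic djb2 truncated to 64 bits, Φ is assumed to be an XOR of the hashed source address with the salt (the paper leaves Φ abstract at this point), and DES is replaced by a toy 4-round Feistel permutation over 64 bits to keep the sketch dependency-free.

```python
import hashlib

MASK64 = (1 << 64) - 1

def djb2_64(s: str) -> int:
    """DJB (djb2) string hash truncated to 64 bits -- Hash() in Eqs (4)/(8)."""
    h = 5381
    for ch in s.encode():
        h = (h * 33 + ch) & MASK64
    return h

def _round(half: int, rk: bytes) -> int:
    # Toy Feistel round function acting on a 32-bit half-block
    digest = hashlib.sha256(half.to_bytes(4, "big") + rk).digest()
    return int.from_bytes(digest[:4], "big")

def e(block: int, key: bytes) -> int:
    """Toy 4-round Feistel permutation standing in for DES -- e() in Eq (6)."""
    l, r = block >> 32, block & 0xFFFFFFFF
    for i in range(4):
        l, r = r, l ^ _round(r, key + bytes([i]))
    return (l << 32) | r

def e_inv(block: int, key: bytes) -> int:
    """Inverse permutation -- e^{-1}() in Eq (9)."""
    l, r = block >> 32, block & 0xFFFFFFFF
    for i in reversed(range(4)):
        l, r = r ^ _round(l, key + bytes([i])), l
    return (l << 32) | r

def f(sa: str, salt: int, key: bytes) -> int:
    """Eqs (4)-(6): hash the source address, mix in the salt, encrypt."""
    p_sa = djb2_64(sa) ^ salt      # Phi(H_SA, salt), assumed to be XOR
    return e(p_sa, key)            # the 64-bit suffix DA_{65:128}

def g(sa: str, suffix: int, salt: int, key: bytes) -> bool:
    """Eqs (8)-(10): decrypt the suffix and compare against Hash(SA)."""
    return e_inv(suffix, key) == (djb2_64(sa) ^ salt)

suffix = f("2001:db8::1", salt=7, key=b"secret")
assert g("2001:db8::1", suffix, salt=7, key=b"secret")      # legitimate flow
assert not g("2001:db8::2", suffix, salt=7, key=b"secret")  # wrong source fails
```

Because e() is a permutation of the 64-bit suffix space, every suffix decrypts to something, but only the one derived from the real source address and the current salt passes Ψ, which is what makes blind scanning of the prefix fruitless.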
Theoretically, the encryption algorithm e() can be arbitrary, as long as the ciphertext can be held in the suffix space in some way. However, we should consider it from two perspectives: security and efficiency. Considering security, a mainstream encryption algorithm should be used here. There are two categories of encryption algorithms: symmetric and asymmetric. The symmetric encryption algorithm has the advantage that, to achieve the same security level, it has a shorter ciphertext length and key length; the advantage of the asymmetric encryption algorithm is that the public key need not be kept secret. In our model, the key is used only by the entrance module and the main service module, which are both under the control of the service owner. There is no need to distribute a public key, and it is not a challenge to keep the encryption key secret, so there is no need for an asymmetric encryption algorithm. Considering efficiency, the symmetric encryption algorithm is also better, with faster encryption and decryption processes. As a result, the symmetric encryption algorithm is the better choice in our model, ensuring a faster encryption/decryption process and a higher level of security for a given ciphertext length.

The most widely used symmetric encryption algorithms are DES [61], 3DES [61], AES [62], etc. Considering the 64-bit length of the IPv6 suffix, to make the encryption easier, DES or 3DES is better here. Although DES is often considered insecure in the modern network environment, it can provide a sufficient level of security in our scenario (discussed in the Security Analysis section), while being much faster than 3DES. As a result, we use DES as e() in our prototype implementation. Nevertheless, our model does not place restrictions on exactly which encryption algorithm is used; that is up to the choice of the server owner.

Salting Algorithm
We discuss the generation of the time-varying factor (salt) in this subsection. To make the generated addresses unpredictable and replay attacks ineffective, adding salt is necessary in the address generation.

Stateful salt is the common choice in many salting scenarios, so we first consider whether it can be used in our model. In that case, the entrance module and the main service module would save the same state sequence. When an address is generated, the entrance module uses the current state as the salt and then hops to the next state; the verification process in the main service module proceeds similarly. However, this is not a good choice here, because synchronization is a challenge. As shown in Fig 4, client A first visits the entrance module and obtains a redirect address generated with salt i; later, client B obtains an address generated with salt i+1. But because of different delays, B reaches the main service module earlier. At that moment the state in the main service module is still salt i, so B's packets fail verification and are wrongly dropped. This situation will happen frequently, since a public server is generally visited simultaneously by a high volume of clients all over the world facing very different network conditions. Synchronizing states between the entrance module and the main service module is therefore very challenging.

Fig 4. Stateful Salt is NOT a Good Choice. Client B visits the entrance module later than client A and gets an address generated with salt i+1. Due to different delays, B visits the main service module earlier than A and fails verification, because the main service module is expecting salt i.

To solve this problem, we introduce a novel stateless salting algorithm. In this algorithm, the entrance module and the main service module do not save any state; public information is used instead. The system timestamp is a desirable choice among the various kinds of public information, because it changes over time naturally and is extremely easy to obtain. We calculate the salt from the timestamp as in Eq (11):

Salt = (SystemTime − T) / X (11)

This equation is similar to the one-time key generation algorithm specified in RFC 6238 [63]. T is the initial value and X is the step size. These two parameters are identical in, and kept secret by, the entrance module and the main service module. Even if the attacker speculates about possible timestamps based on the current time, he cannot obtain the salt because of the confidentiality of T and X, which further enhances the security of the encryption.

The salting process is described by Eq (12):

P_SA = XOR(Salt, H_SA) (12)

In Eq (12), P_SA is as described in Eq (5).
XOR() is used as the salting function in Eq (12). Here XOR() can be replaced by any operation, with only two requirements: (1) the result must differ whenever the salt changes, and (2) the operation must be reversible. XOR() is used because its complexity is extremely low, and this does not introduce any additional security risk.

The server's verification process is described by the following equations:
P_SA = e⁻¹(DA[64:128]) (13)
Salt = XOR(P_SA, H_SA) (14)
T_s = Salt × X + T (15)
Result = (SystemTime − T_s) ∈ (0, threshold) ? True : False (16)

In the following, we discuss the value of threshold in Eq (16). Both the time required on typical occasions and the error redundancy should be considered. Assume that the timestamp used for encryption in the entrance module is T_s and the system time at verification in the main service module is T_s′; then the time difference ∆T = T_s′ − T_s can be described by Eq (17):

∆T = T_trans_en + T_trans_ma + T_pro_cl + T_pro_en + T_pro_ma + T_syn (17)

In Eq (17):
T_trans_en is the transmission delay of the packet from the entrance module to the client;
T_trans_ma is the transmission delay from the client to the main service module;
T_pro_cl is the processing time of the client between receiving the redirect message and sending the new request to the main service module;
T_pro_en is the processing time of the entrance module from timestamping to sending the message;
T_pro_ma is the processing time of the main service module from receiving the message to verifying the timestamp;
T_syn is the system time difference between the main service module and the entrance module.

This is illustrated in Fig 5.
Fig 5. Time Elapsed Between Timestamp of Encryption and Verification.
It consists of two transmission delays (entrance module to client, and client to main service module) and three processing times (entrance module, client, and main service module). Note that the system time difference T_syn is not included.

In Eq (17), T_pro_en and T_pro_ma are usually negligibly small. The entrance module and the main service module are both under the control of the server operator, so the system time difference can be minimized and T_syn should also be negligible. In this case, Eq (17) becomes:

∆T = T_trans_en + T_trans_ma + T_pro_cl (18)

That is, the threshold should be at least greater than the sum of the delay from the entrance module to the client, the delay from the client to the main service module, and the time required by the client to process the redirect message. This usually varies from several milliseconds to several seconds. Allowing for redundancy, a threshold of about 10 seconds is generally appropriate. A larger threshold brings higher security risks, while a smaller threshold means less redundancy.

If synchronization between the entrance module and the main service module is hard, meaning that T_syn in Eq (17) cannot be neglected, then the verification function Eq (16) should be modified as Eq (19):

Result = (SystemTime − T_s) ∈ (−threshold₁, threshold₂) ? True : False (19)

Here the lower bound −threshold₁ is negative, because the system time of the main service module when it performs verification can be earlier than the system time of the entrance module when it applied the timestamp. Since the difference in system time synchronization is usually stable, system operators can adjust threshold₁ and threshold₂ accordingly.

However, considering that all modules are controlled by the server owner, time synchronization should not be a challenge, so on most occasions we can regard T_syn as negligible.

The salting algorithm is described in Fig 6.

Fig 6. Salting Process.
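The salting and verification arithmetic of Eq (11)-(16) can be sketched as follows. The values of the secret parameters T (written T0 below), X, and the threshold are illustrative assumptions, H_SA stands in for the hashed source address, and encryption is omitted to isolate the salting logic.

```python
# Stateless salting sketch of Eq (11)-(16). T0, X, and THRESHOLD are
# assumed demo values; in the model they are secrets shared only by the
# entrance module and the main service module.
import time

T0 = 1_600_000_000.0   # secret initial value T (assumed)
X = 0.005              # secret step size, 5 ms (assumed)
THRESHOLD = 10.0       # verification window in seconds

def make_salted_plaintext(h_sa: int, now: float) -> int:
    salt = int((now - T0) / X)          # Eq (11)
    return salt ^ h_sa                  # Eq (12): P_SA = XOR(Salt, H_SA)

def verify(p_sa: int, h_sa: int, now: float) -> bool:
    salt = p_sa ^ h_sa                  # Eq (14)
    t_s = salt * X + T0                 # Eq (15)
    return 0 < (now - t_s) < THRESHOLD  # Eq (16)

h_sa = 0x123456789ABCDEF0
t_gen = time.time()
p_sa = make_salted_plaintext(h_sa, t_gen)
assert verify(p_sa, h_sa, t_gen + 1.0)       # arrives 1 s later: accepted
assert not verify(p_sa, h_sa, t_gen + 60.0)  # arrives 60 s later: rejected
```

Because the salt is recomputed from the received value rather than from stored state, no synchronization between the two modules is needed beyond ordinary clock agreement.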
Prefix Length Consideration
Generally speaking, according to RFC 4291 [41] and related RFCs, an IPv6 address has a 64-bit prefix and a 64-bit interface identifier. Although this split is not mandatory, it is consistent with our algorithm: in the discussion above, the 64-bit prefix allocated to the server is used for routing, and the 64-bit IID carries the encrypted information.

However, our model does not require the prefix length to be 64 bits. Assigning a shorter prefix such as a /56 is also feasible. The prefix space is used for routing, so in some cases the server has to be allocated a longer prefix: for example, when the server is built in a /64 subnet that contains more than one device, a /68 or /72 prefix can be assigned.

A longer prefix means a shorter suffix, and thus less space to carry the encrypted information. The suffix cannot be too short, otherwise security will be jeopardized. As an extreme example, if the prefix is /120 and the suffix is only 8 bits long, an attacker can trivially traverse the entire range space of the ciphertext and hit a legal address with only 256 trials. In addition, some encryption algorithms need to be modified to shorten the ciphertext when the suffix is shorter than 64 bits.

Further, the prefix length is a flexibly adjustable configuration in our model, because it is known only by the entrance module and the main service module. The service operator can easily adjust the prefix length cooperatively when the network configuration changes.
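The trade-off above can be made concrete with a quick calculation: under brute-force scanning, the expected number of random trials needed to hit a legal address grows with the suffix space (the salt window is ignored here for simplicity).

```python
# Expected brute-force trials to hit one legal address is on the order
# of the suffix space size, 2^(128 - prefix_len). The salt window,
# which enlarges the set of legal addresses, is ignored for simplicity.

def expected_trials(prefix_len: int) -> int:
    suffix_bits = 128 - prefix_len
    return 2 ** suffix_bits

assert expected_trials(120) == 256       # /120: trivially scannable
assert expected_trials(64) == 2 ** 64    # /64: infeasible to brute-force
```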
Load Balancing and High Availability
In the discussion above, there is one entrance module and one main service module by default. In fact, our model supports multiple main service modules and/or multiple entrance modules without any modification.

Multiple main service modules can be deployed smoothly and transparently. For example, in Fig 7, four main service modules are deployed to jointly serve the /64 address space. In this case, the entire server cluster shares a /64 prefix, but the prefix of each device may have any length longer than /64, and the lengths may differ. The only requirement is that the prefixes of all the devices together cover the entire /64 address space; otherwise routing will fail and some packets will go unanswered. Note that overlap is allowed, since the longest-prefix-matching routing rule will ensure proper routing. The existence of multiple devices, their number, and the respective prefix length of each device are all imperceptible to the outside world and cannot be obtained by any measurement approach. This invisibility obviously helps to protect server security.
Fig 7. Multiple Main Service Modules and Random Load Balancing.
In this example, four main service modules jointly serve the /64 prefix of the server. Because the generated addresses are uniformly distributed and sufficiently random, the load is automatically distributed evenly among them.

Similarly, multiple entrance modules can be distributed across different territories and different ISP networks to optimize performance. Because the entrance module only provides the address generation function, which is very simple and stateless, it is even easier to stack arbitrarily in parallel. These entrance modules carry the same key and the same logic, so they generate the same address for the same client at the same time.

This support for multiple main service modules and/or multiple entrance modules goes beyond capacity expansion: it provides a new and flexible framework for load balancing, high availability, and related features.
Random Load Balancing.
Our model achieves random load balancing automatically, without any further modification or configuration. Because the addresses generated by the entrance module's algorithm are uniformly distributed and sufficiently random, the load is evenly distributed over the suffix space. If the devices all have the same prefix length, the load is evenly distributed among them: in Fig 7, each device has a /66 prefix, so the load is split evenly across the four devices. If the devices have different prefix lengths, the load is distributed in proportion to the size of the suffix space served by each of them. For example, in Fig 7, if we delegate 2001:da8::/67, 2001:da8:0:0:2000::/67, 2001:da8:0:0:4000::/66, and 2001:da8:0:0:8000::/65 to the four devices respectively, they will carry 1/8, 1/8, 1/4, and 1/2 of the load respectively.

The advantages of the random load balancing feature in our model include:

1. Simplicity. Random load balancing in our model is completely stateless and requires no scheduling. The devices of the main service module can be arbitrarily stacked, and the load will be automatically distributed among them. Distributing the load in proportion to device capacity is easy: operators only need to delegate prefixes of different lengths to the devices and configure routes accordingly.

2. Security. Our address generation algorithm is random, and the configuration of the main service module's devices is completely imperceptible to the outside world. Attacks against the load balancer are thus ineffective.

However, random load balancing lacks control. The load is distributed completely at random and cannot be steered according to device status. Strictly speaking, it is the number of requests rather than the load itself that is distributed, and the distribution is statistically even rather than strictly even at every moment. Resources consumed by different connections may vary greatly, and random allocation may not be perfectly uniform; some devices may be temporarily overloaded while others sit idle, wasting resources and possibly causing service failures on individual devices.
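The proportional sharing described above can be checked numerically. A small sketch using the Fig 7 example: each device's share equals the fraction of the /64 suffix space covered by its sub-prefix.

```python
# Load share under random load balancing: a sub-prefix of length plen
# (plen >= 64) under the server's /64 covers a 2^(64 - plen) fraction
# of the suffix space, which is exactly its share of the requests.
from fractions import Fraction

def load_share(sub_prefix_len: int, server_prefix_len: int = 64) -> Fraction:
    return Fraction(1, 2 ** (sub_prefix_len - server_prefix_len))

shares = [load_share(p) for p in (67, 67, 66, 65)]  # the Fig 7 example
assert shares == [Fraction(1, 8), Fraction(1, 8),
                  Fraction(1, 4), Fraction(1, 2)]
assert sum(shares) == 1   # the sub-prefixes cover the whole /64 space
```

The coverage check at the end mirrors the requirement in the text: the delegated sub-prefixes must jointly cover the entire /64 space.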
Dynamic Load Balancing.
To overcome the above limitation of random load balancing, our model supports dynamic load balancing through two schemes: routing configuration and entrance module strategy.

One way to achieve dynamic load balancing is based on routing, with all the devices of the main service module sharing one /64 prefix. The delegation of sub-prefixes to each device under this prefix, and the related routes, can be configured dynamically. In this way, the load of different devices can be adjusted in real time to balance their utilization. Take care that during any adjustment the sub-prefixes of all the devices must always cover the entire /64 address space.

A high availability cluster can be achieved naturally in this scheme, including active-active clusters and hot standby. When a device is detected to be failing, it can go offline simply by immediately redistributing its prefixes and routes to other devices, so that the service is not affected. A self-illustrating example is given in Fig 8.

Another way to achieve dynamic load balancing is based on the entrance module strategy, where each device of the main service module is delegated one /64 prefix
Fig 8. Active-Active High Availability Cluster in Dynamic Load Balancing based on Routing.
When a device fails, it can go offline simply by redistributing its prefixes and routes to other devices.

instead of sharing one /64 prefix. This means the entire server needs a shorter prefix; nevertheless, for a server that needs dynamic load balancing, a /48 or even shorter prefix is not a problem [64]. In this case, the entrance module adds an extra step when generating destination addresses: the 64-bit suffix is still generated as described above, but the 64-bit prefix is no longer the fixed server prefix. Instead, the prefix is selected among the prefixes of the cluster devices using a load balancing algorithm, such as round-robin or least connections.

An example is given in Fig 9. The main service module is composed of four devices, each configured with a /64 prefix: 2001:da8::/64, 2001:da8:0:1::/64, 2001:da8:0:2::/64, and 2001:da8:0:3::/64 respectively. The prefix of the entire server is thus 2001:da8::/62. The load balancer resides in the entrance module and selects among these four prefixes according to real-time load using a load balancing algorithm. The entrance module generates the suffix using the encryption algorithm f(SA) described in previous sections, then combines it with the selected prefix to form the destination address.

In this case, the devices of the main service module do not need to be connected to the same first-hop router and can be deployed in a distributed fashion (note the difference between Fig 9 and Fig 8). Similar to a CDN based on DNS redirection, the main service modules can be distributed all over the world, with the entrance module responsible for server selection. The optimal main service module can be selected in a fully controllable manner using information such as the geographic location, network capacity, and ISP of the client.

All in all, our model provides a new and flexible framework that naturally supports various load balancing, high availability, and CDN features.
For specific network scenarios such as data center networks, the topology, structure, and routing configuration can be further optimized. This framework offers great potential for future work.
Fig 9. Dynamic Load Balancing based on Entrance Module Strategy.
The load balancer resides in the entrance module and selects among the prefixes of multiple main service modules according to load.
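The entrance-module strategy of Fig 9 can be sketched as follows. Round-robin is shown as the load balancing policy; the prefixes are those of the Fig 9 example, and the 64-bit suffix is taken as given, standing in for the output of f(SA).

```python
# Sketch of the entrance-module strategy of Fig 9: the suffix is still
# produced by the encryption function f(SA), while the /64 prefix is
# picked by a load-balancing policy. Round-robin is used here; the
# suffix argument is a placeholder for the real f(SA) output.
import itertools
import ipaddress

DEVICE_PREFIXES = ["2001:da8::", "2001:da8:0:1::",
                   "2001:da8:0:2::", "2001:da8:0:3::"]  # Fig 9 example
_rr = itertools.cycle(DEVICE_PREFIXES)

def generate_destination(suffix64: int) -> str:
    prefix = ipaddress.IPv6Address(next(_rr))   # round-robin choice
    return str(ipaddress.IPv6Address(int(prefix) | suffix64))

addr = generate_destination(0x0123456789ABCDEF)
assert addr.startswith("2001:da8")
```

Swapping the `itertools.cycle` iterator for a least-connections or latency-aware selector changes only the prefix choice; the suffix generation and verification are untouched.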
Security Analysis
The primary goal of our model is to prevent the server from being scanned. Therefore, we first discuss its anti-scanning feature and then its other security features. Finally, we analyze the statistical characteristics of the addresses generated in our model and its ability to resist big data analysis.
Network Scanning
As introduced in the Related Work Section, recent advances in IPv6 scanning can be divided into two types:

1. Scanning by collecting active IPv6 address records.
2. Scanning by generating a hitlist using pattern recognition algorithms.

Scanning by collecting active address records does not pose a threat to our model. The main service module is imperceptible to the outside world. Even if one of its addresses is collected, the address is no longer active as soon as the flow terminates, which makes the collected records completely useless.

Scanning by generating hitlists using pattern recognition algorithms does not pose a threat to our model either. According to the discussion in Subsection Big Data Analysis, the addresses generated by our algorithm exhibit no pattern, so scanning through generated hitlists is no different from brute-force scanning.

Another approach is to collect active addresses inside the subnet. However, as described in previous sections, the main service module does not use any global unicast address in the subnet, so attackers cannot obtain any useful addresses this way. This approach therefore poses no threat to our model either.

As a result, for our model all the above techniques reduce to brute-force scanning, which is generally unworkable in IPv6.

However, things are complicated by our salting algorithm, which allows more than one destination address to pass verification for the same source address at the same time. For example, if the step size X in Eq (11) is 5 milliseconds and the threshold in Eq (16) is 10 seconds, then 2000 legitimate addresses can pass verification simultaneously. Assuming the suffix length of the main service module is N and there are P legal timestamps within the threshold, the probability that a random address passes verification is enlarged from 2^−N to P × 2^−N.
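A quick numeric illustration of this enlargement, using the example values above (step size 5 ms, threshold 10 s, and a 64-bit suffix):

```python
# Hit probability for a random address: with an N-bit suffix and P legal
# timestamps inside the verification window, probability = P / 2^N.
from fractions import Fraction

def pass_probability(n_bits: int, threshold_s: float, step_s: float) -> Fraction:
    p_legal = round(threshold_s / step_s)  # legal timestamps in the window
    return Fraction(p_legal, 2 ** n_bits)

prob = pass_probability(64, 10.0, 0.005)   # X = 5 ms, threshold = 10 s
assert prob == Fraction(2000, 2 ** 64)     # enlarged, yet vanishingly small
```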
Fig 10 shows an illustration, where all addresses from Address 1 to Address 4 can pass verification.

Fig 10. More Than One Legitimate Destination Address Can Pass Verification.
This is due to the threshold introduced in Eq (16).

With the same scanning efficiency as Zmap under IPv4, which takes 45 minutes to scan the 2^32 address space, the expected scanning time T will be

T = 45 min × 2^(N−32) / P (20)

To protect the server from being scanned, T should be long enough, for example longer than 1 year. Then P and N should satisfy Eq (21):

N − log₂P ≥ 46 (21)

Eq (21) is easy to satisfy. For example, with a threshold of 10 seconds and a step size of 1 ms, the expected scanning time T is about 9 years, far too long to be a concern. Therefore, the complexity introduced by the salting algorithm does not harm the anti-scanning feature of our model.

The discussion above focuses on preventing the main service module from being scanned. Note that the entrance module is inevitably exposed to scanning, because some sort of entry address must be configured into the DNS system. However, first, scanning the entrance module does not pose a threat to the main server, because the address that provides the main service has been separated from the entrance module. Second, this separation eliminates unnecessary exposure, and the main service module, as the main body of the server, is well protected. Third, the entrance module is somewhat similar to a bastion host: it provides no business logic, and the consequences of its being scanned are much more limited. Risks are thus isolated and contained to a smaller scope.

DoS Attack
DoS attacks are among the major threats faced by high-profile public servers. We discuss two typical types here: TCP SYN flood and UDP flood.

A SYN flood does not pose a threat to our model. All messages received by the main server are verified, and those with inappropriate source address-destination address pairs are discarded. This means the server neither sends SYN+ACK messages nor allocates CPU time or memory to maintain a queue waiting for subsequent packets.

Similarly, for UDP floods, all illegal UDP packets are dropped immediately and no resources are allocated. Therefore, neither SYN floods nor UDP floods pose a threat to our model.

Admittedly, the mitigation of DoS attacks is also limited. Our model cannot prevent attacks that do not target the server directly, such as bandwidth attacks. Nor can it prevent attacks in which each malicious request first obtains a legitimate address from the entrance module. In that case, an intrusion detection system can be deployed on the entrance module to detect and respond to abnormal traffic.

Again, the entrance module itself remains exposed to DoS attacks. However, the entrance provides limited service with little traffic load, so its capacity to withstand DoS attacks is much higher. Its function is also very simple, so it can easily be duplicated, stacked, and deployed in a distributed fashion, which further enhances its resistance.
Application Vulnerability Attack
Preventing application vulnerability attacks generally requires timely software and hardware updates. However, our model can mitigate this threat by applying stricter authentication in the entrance module. The entrance module can be configured with authentication or admission strategies so that only authorized users can obtain a legitimate address while all other requests are discarded. In this way, adversaries cannot even obtain the address of the main service module, let alone exploit application or operating system vulnerabilities to launch attacks such as SQL injection.
Replay Attack and Session Hijack
A salting algorithm is introduced in our model to prevent replay attacks. The destination address generated by the entrance module is different each time, so if an attacker launches replay attacks using previously intercepted packets, the address will already have become illegal.

However, there is a complication introduced by the threshold in Eq (16). Though the entrance module dutifully generates a different address each time, a given address can pass the verification of the main service module within a time window of length threshold. Thus if an attacker intercepts a packet and replays it quickly enough, within threshold, the address will remain valid and pass verification.

This can be solved by requiring the main service module to cache the addresses used in recently ended connections for a time window of length threshold, and to reject connection requests whose destination addresses have been used recently. However, this caching is memory-intensive for a high-traffic server, and new vulnerabilities such as cache exhaustion attacks can be introduced, although the attacker would first have to contact the entrance module to obtain a huge number of legitimate addresses. This strategy, accompanied by an intrusion detection system deployed on the entrance module, can be an option for servers with high security requirements but moderate traffic.

Similar to fast replay attacks, an attacker can monitor a connection and launch session hijacking attacks. This is possible because, for performance reasons, verification is conducted only once per flow in our model: once the connection is established, the client and the main service module communicate directly without further verification, which can be taken advantage of.

Both kinds of attack can be prevented using encryption at higher layers. Since our model operates at the network layer, it is orthogonal to, and thus compatible with, encryption protocols at the transport or application layer, such as TLS [34], DTLS [65], and HTTPS [66]. These protocols encrypt the packet payload and invalidate replay and session hijacking attacks.
Key Cracking
To break symmetric encryption, plaintext-ciphertext pairs need to be obtained in order to guess the key. Even if an attacker intercepts messages, the plaintext of the function e() in Eq (6) is confidential and cannot be inferred from the source address, because the salting algorithm in our model makes the plaintext time-varying. The precise timestamp is difficult to obtain, and the parameters T and X are confidential besides. As a result, although DES is considered insecure on the modern Internet, it provides a sufficient security level in our model.

ND-related Attacks
ND attacks are unique to IPv6 because the ND protocol [67] is designed to replace the ARP protocol [68] of IPv4. There are many kinds of ND-related attacks, which can be roughly divided into DAD attacks [69, 70], spoofing attacks (such as RA spoofing and malicious redirection), and cache exhaustion attacks.

Our model allocates a prefix, rather than an address, to the main service module, and there is only one device under this prefix. Many ND packets used for neighbor communication are no longer needed, so the related threats are eliminated, including rogue NA and NS messages and DAD DoS attacks [71].

On the other hand, ND messages are still used in the prefix allocation process by mechanisms such as DHCP-PD, so RA spoofing attacks remain a threat. Link-local addresses are still used to communicate with the first-hop router in the subnet, so ND attacks on link-local addresses are still effective. Meanwhile, the entrance module still has a fixed address that can be attacked. Therefore, although our model protects the main service module from many ND attacks, it is recommended to deploy ND-related security policies such as SEND [33, 72] and RA-Guard [73, 74].
Big Data Analysis
In our model, one concern is that an attacker might obtain many source address-destination address pairs by intercepting large volumes of network traffic and learn patterns of the generated addresses to launch attacks and scans. However, if the generated addresses have sufficiently good statistical characteristics, that is, if they are random and evenly distributed enough, attackers cannot crack, scan, or attack by big data analysis.

We use simulations to demonstrate that the destination addresses generated by our algorithm have good statistical characteristics. In the simulation, we generate a large number of destination address suffixes and then analyze them. Our simulation is conducted in two groups, and the results are shown in Fig 11.
Fig 11. Distribution of the Generated Suffixes. (a) is the result of Group 1, where 1000 destination addresses are generated using one fixed source address at different times. (b) is the result of Group 2, where 1000 destination addresses (blue points) are generated using 1000 source addresses (red points) collected from real-world campus network traffic. The generated suffixes are evenly distributed in both groups.

The figure is plotted as follows. Since our algorithm generates address suffixes, we show a scatter plot of the generated suffixes. The vertical coordinate of each point is determined by bits 65-96 of the address: if bits 65-96 are 0000:0000, the vertical coordinate is 0, and if they are FFFF:FFFF, it is 1. The horizontal coordinate is determined by bits 97-128 in the same way.

In Group 1, we use a fixed source address and generate 1000 destination address suffixes from it. The salt naturally varies over time, so the generated addresses differ. The distribution of the generated suffixes, shown in Fig 11(a), indicates that suffixes generated from the same source address over time are evenly distributed and have no obvious pattern.

In Group 2, we collect 1000 active addresses from real-world network traffic on the Tsinghua Campus Network and use them as source addresses to generate 1000 destination address suffixes. The results are shown in Fig 11(b): the red points are the suffixes of the source addresses, and the blue points are the suffixes of the generated addresses. An interesting observation is that the collected IPv6 addresses have an obvious pattern: a large number of them set all of bits 65-96 to zero, while another large number set all of bits 97-128 to zero. Nevertheless, the suffixes generated by our algorithm from these collected addresses are still evenly distributed.

Furthermore, to evaluate the randomness of the generated addresses, we calculated their entropy. This approach is often used to evaluate the randomness of IP addresses [11-14]. The entropy is calculated for each nybble (a nybble is 4 bits), so there are 16 entropy values in total for a 64-bit suffix. In probability theory, for a discrete random variable X with possible values {x_1, ..., x_k} and probability mass function P(X), entropy is defined as

H(X) = −Σ_{i=1}^{k} P(x_i) log P(x_i) (22)

For each nybble in our case, there are 16 possible hex characters. From probability theory we know that H(X) is maximized when P(X) is uniform, so here the maximum entropy is log 16. For the k-th nybble, if character c_i occurs N_i times (out of N samples in total), the entropy normalized by the maximum entropy is

e_k = −(1 / log 16) Σ_{i=1}^{16} (N_i / N) log(N_i / N) (23)

The results are shown in Fig 12: Fig 12(a) for Group 1 and Fig 12(b) for Group 2. The red line is the entropy of the source address suffixes, and the blue line is the entropy of the generated destination address suffixes. It is interesting yet reasonable that the collected addresses are not very random, especially in the higher nybbles. In both groups, the entropy of the generated suffixes is almost 1 for every nybble. Since the entropy is normalized by the maximum entropy, this means that whether or not the source addresses are random, the generated address suffixes are sufficiently random in all bits.

Fig 12. Entropy of the Generated Suffixes. (a) is the result of Group 1, where 1000 destination addresses are generated using one fixed source address at different times. (b) is the result of Group 2, where 1000 destination addresses are generated using 1000 source addresses collected from real-world campus network traffic. The blue line shows the entropy of the suffixes of the generated destination addresses, and the red line shows the entropy of the suffixes of the source addresses. In both groups, the entropy of the generated suffixes is almost 1, the maximum. The generated address suffixes are sufficiently random in all bits.

In short, our model can withstand big data analysis attacks launched by obtaining and analyzing large volumes of traffic data.
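The per-nybble normalized entropy of Eq (22)-(23) can be computed with a short routine like the following sketch; it reproduces the two boundary behaviors (near-maximal entropy for random suffixes, zero for constant ones).

```python
# Per-nybble normalized entropy of Eq (22)-(23): for each of the 16
# nybbles of a 64-bit suffix, Shannon entropy over the sample set,
# normalized by the maximum log2(16) = 4 bits.
import math
import random
from collections import Counter

def nybble_entropies(suffixes):
    result = []
    for k in range(16):  # nybble 0 is the most significant
        counts = Counter((s >> (4 * (15 - k))) & 0xF for s in suffixes)
        n = len(suffixes)
        h = -sum(c / n * math.log2(c / n) for c in counts.values())
        result.append(h / 4)  # normalize by the maximum entropy
    return result

random.seed(0)
ent = nybble_entropies([random.getrandbits(64) for _ in range(1000)])
assert all(0.9 < e < 1.0001 for e in ent)   # random: near-max entropy
assert nybble_entropies([0] * 100) == [0.0] * 16  # constant: zero entropy
```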
Prototype Implementation and Experiments
In this section, we introduce our prototype implementation and experiments based onthe prototype.
Prototype Implementation
Our prototype is implemented on Linux, specifically Raspbian OS (based on Debian). We modify the NetFilter Linux kernel module to implement the mechanism. The prototype implements the following features:

1. The server is divided into two modules. The entrance module is configured with an IPv6 address, while the main service module is assigned a prefix using DHCP-PD. The main service module listens on all addresses under the prefix.

2. The entrance module and the main service module hold the same key. When the entrance module receives a request, it generates an address under the main service module's prefix and redirects the request to that address. When the main service module receives a request, it verifies the flow.

3. The encryption and verification use DES as e() in Eq (6), and the salt is added as in Eq (11) to prevent replay attacks.

The algorithm implemented in the prototype is described as follows: the encryption algorithm executed in the entrance module is Algorithm 1, and the verification algorithm executed in the main service module is Algorithm 2.
Algorithm 1 Encryption Algorithm Executed by the Entrance Module
Input: SourceAddress, Key, Prefix, T0, X
Output: DestinationAddress
  H_SA = DJB(SourceAddress)
  T_current = time.currenttime()
  Salt = (T_current - T0) / X
  P_SA = XOR(Salt, H_SA)
  Suffix = DES(P_SA, Key)
  DestinationAddress = strcat(Prefix, Suffix)
  return DestinationAddress

Algorithm 2 Verification Algorithm Executed by the Main Service Module
Input: SourceAddress, DestinationAddress, Key, T0, X, T_threshold
Output: True/False
  H_SA = DJB(SourceAddress)
  Suffix = DestinationAddress[64:128]
  P_SA = DES^-1(Suffix, Key)
  T_current = time.currenttime()
  Salt = XOR(H_SA, P_SA)
  T_send = Salt * X + T0
  T_delta = T_current - T_send
  if T_delta < T_threshold and T_delta > 0 then
    return True
  end if
  return False
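Algorithms 1 and 2 can be sketched end to end in Python. Since DES is not in the Python standard library, the sketch below substitutes a toy 8-round Feistel network over 64-bit blocks for DES, and all names (make_destination, verify, etc.) are ours, not the prototype's; it also assumes the verifier knows the prefix so it can strip it before decrypting. This is an illustration of the scheme, not the kernel-module implementation:

```python
import hashlib
import time

MASK64 = (1 << 64) - 1

def djb_hash(s: str) -> int:
    """DJB string hash of the source address, truncated to 64 bits."""
    h = 5381
    for ch in s:
        h = (h * 33 + ord(ch)) & MASK64
    return h

def _round(half: int, key: bytes, r: int) -> int:
    # Keyed round function for the toy Feistel cipher (stand-in for DES).
    d = hashlib.sha256(key + bytes([r]) + half.to_bytes(4, "big")).digest()
    return int.from_bytes(d[:4], "big")

def encrypt64(block: int, key: bytes, rounds: int = 8) -> int:
    left, right = block >> 32, block & 0xFFFFFFFF
    for r in range(rounds):
        left, right = right, left ^ _round(right, key, r)
    return (left << 32) | right

def decrypt64(block: int, key: bytes, rounds: int = 8) -> int:
    left, right = block >> 32, block & 0xFFFFFFFF
    for r in reversed(range(rounds)):
        left, right = right ^ _round(left, key, r), left
    return (left << 32) | right

def make_destination(src, key, prefix, t0, x, now=None):
    """Algorithm 1: encrypt the salted source-address hash into a suffix."""
    now = time.time() if now is None else now
    salt = int((now - t0) // x)          # time-varying salt
    suffix = encrypt64(salt ^ djb_hash(src), key)
    quads = [f"{(suffix >> s) & 0xFFFF:04x}" for s in (48, 32, 16, 0)]
    return prefix + ":".join(quads)

def verify(src, dst, key, prefix, t0, x, t_threshold, now=None):
    """Algorithm 2: recover the salt from the suffix, check the time window."""
    now = time.time() if now is None else now
    suffix = int(dst[len(prefix):].replace(":", ""), 16)
    salt = decrypt64(suffix, key) ^ djb_hash(src)
    t_send = salt * x + t0
    return 0 < now - t_send < t_threshold
```

In a connection, the entrance module would call make_destination and answer the client with a redirect to the generated address; the main service module would call verify on each incoming flow and drop packets that fail.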
Experiment Environment
Our experiment is built on a /48 subnet in CERNET (China Education and Research Network). The entrance module and the main service module are configured in the same subnet. We use DHCP-PD to allocate a prefix to the main service module, while the addresses of the client and the entrance module are configured manually. We use Kea for the DHCP server.

We use Raspberry Pis for all the devices in the experiment, and the operating system of each Pi is Raspbian, which ships with the Raspberry Pi and is based on Debian. The experiment topology is described in Fig 13.

Five devices are configured in the experiment subnet: one is a legal client (Client 1), one is a scanner (Client 2), one is the entrance module of the server, and one is the main service module. To offer a baseline for the experiment, we configure the last one as a traditional server, serving as the control group.

Fig 13. Topology of the Experiment Subnet.
Experiment on Defense of Scanning
In the experiment, the server is configured as a public HTTP web server based on Apache and the Django framework. For demonstration, the demo web page displays the source address and destination address of each visit. We first use a client to access the server through the entrance module; the result shows that the client can visit the main server smoothly, while it cannot get any response if it tries to visit the main server directly. This shows that the main server cannot be perceived by the outside world directly.

To test the scanning defense of the main server, a client is used as the attacker. Three types of scans are tested in the experiment. First, we perform a 100-hour brute-force scan on the main service module. The result shows that the scan does not hit any address.

Then, we generate 1 million addresses under the /64 prefix using Entropy/IP and 6gen, and use them as hitlists to scan the server. This scan does not hit any address either.

Finally, ND information for other devices in the subnet is collected to compose a hitlist to conduct the scan. Again, no address can be accessed. Therefore, our model can prevent the main service module from being scanned by existing IPv6 scanning approaches.
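For intuition on why the brute-force scan cannot hit, a rough back-of-the-envelope estimate is helpful (the probe rate below is our assumption for illustration, not a figure measured in the experiment):

```python
# Illustrative only: expected number of hits of a blind brute-force scan
# of one /64, with a single valid address at any moment.
suffix_space = 2 ** 64          # candidate addresses under the /64 prefix
probe_rate = 1_000_000          # assumed probes per second (fast scanner)
duration_s = 100 * 3600         # the 100-hour scan from the experiment
probes = probe_rate * duration_s
expected_hits = probes / suffix_space
print(f"{expected_hits:.2e}")   # prints 1.95e-08: essentially zero hits
```

Even under these generous assumptions, the expected number of hits over the entire 100-hour scan is on the order of 10^-8, consistent with the experimental result that no address was hit.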
Experiment on Performance
We conduct two sets of experiments on performance in this subsection: first on the delay in the connection establishment phase, then on the RTT, bandwidth, and jitter after the connection is established.

Because our model introduces additional overhead only in the connection establishment phase, which influences the delay most, we first conduct experiments on the delay in this phase. We use Wireshark to monitor the delay. Compared with the control group, the extra delay introduced by our model equals the time that elapses between the entrance module receiving the first packet and the main service module receiving the first packet. This value can be described by Eq (24).

T_add = T_trans_en + T_trans_ma + T_process_cl + T_encryption_en + T_verification_ma    (24)

In Eq (24), T_trans_en is the transmission delay from the entrance module to the client; T_trans_ma is the transmission delay from the client to the main service module; T_process_cl is the processing time from the client receiving the redirect message to the client sending the new request to the main server; T_encryption_en is the processing time from the entrance module receiving the request to it sending the redirect message; and T_verification_ma is the processing time from the main service module receiving the request to it completing the verification. We test these five items separately to analyze the degree to which each of them influences the delay.

First, T_trans_en and T_trans_ma typically vary from several to hundreds of milliseconds. The one-way transmission delay measured in our experiment environment is very small (several milliseconds), but this value is greatly affected by the specific network environment of each client, so our results are not representative and we do not display them here.
However, considering that the additional cost caused by the two one-way delays is usually small (generally at most a few hundred milliseconds) compared with the total access time, it should not cause a user-perceivable impact.

Second, we test the effect of T_process_cl, which depends on the hardware and the application of the client. We conducted the experiments with three groups of client settings: Group (a) uses Linux and Wget, Group (b) uses Windows and the IE 11 browser, and Group (c) uses Windows and the Chrome browser. Each group launches access to our prototype website five times, with an hour interval between each, and T_process_cl is measured in each test using tcpdump/Wireshark. The result is shown in Fig 14.

Fig 14. T_process_cl with Different OSes and Browsers. We measure the processing time of the client from receiving the redirect message to sending the new request to the main server. It is greatly affected by the environment, but does not exceed 10 milliseconds.

From Fig 14, we can see that T_process_cl is greatly affected by the hardware and software environment, but it is less than 10 milliseconds in all groups. This means that T_process_cl brings little additional delay and does not influence user experience.

Finally, T_encryption_en and T_verification_ma represent the execution time of the encryption and verification process, respectively. We conduct experiments on the encryption and verification time cost. The experiment is repeated 10 times. The result is shown in Fig 15.

Fig 15. Encryption Time and Verification Time.
The time cost is less than 0.05 milliseconds, and its influence on server performance is negligible.

From Fig 15, we can see that the execution time of a single encryption or verification operation does not exceed 0.05 milliseconds, which is negligible relative to the delay and does not affect server performance.

In summary, the additional delay brought by our model is determined by the two one-way transmission delays, while the sum of the other time overheads does not exceed several milliseconds. The total additional delay is acceptable and does not affect user experience.

To test the impact of our model on network performance after the connection is established, we conduct experiments on the RTT, bandwidth, and jitter between the client and the server. The result is shown in Fig 16. Our model does not bring any additional cost in RTT, bandwidth, or jitter once the connection is established.
Fig 16. RTT, Bandwidth, and Jitter Between Client and Server after Connection Establishment. There is no performance difference between our model and a traditional server; our model brings no additional cost once the connection is established.
Discussion
Compatibility, Deployability, and Evolvability
Our model has good compatibility with the current Internet. A client agnostic of our model can visit the server smoothly, and no other participant of the Internet, such as ISPs or DNS servers, needs any modification. Moreover, the modification is limited to the network layer, so the model transparently supports transport-layer and application-layer protocols. All in all, our model requires no transition mechanism, is simple, and can be easily deployed.

Not only can our model provide a new perspective on many security and functional issues, including the scanning problem, but it also provides a new and flexible framework for public servers. Various strategies can be formulated based on this framework, such as user authentication on the entrance module, distributed deployment, and load balancing and high availability of the server cluster. We believe that various exciting models can be proposed based on this framework. Therefore, our model has good scalability and potential.
Consideration on IPv6 Address Space
Is it too extravagant to assign an entire prefix to one server in our model? Although the number of prefixes that can be allocated is much smaller than the number of addresses in the IPv6 address space, the worry that IPv6 addresses will be exhausted is unnecessary. All addresses under 2000::/3 are global unicast addresses. There are 8.6 billion global unicast /36 prefixes in total, which means that everyone in the world could be allocated a /36 prefix, not to mention that there are 268 million /64 prefixes under a single /36 prefix. Even if a server is allocated a /56 prefix or even a /48 prefix, the current IPv6 address space is still sufficient. Therefore, assigning a prefix to each server will not exhaust the IPv6 address space.
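The prefix counts quoted above follow directly from the prefix lengths; a quick check:

```python
# 2000::/3 holds every global unicast address; counting sub-prefixes is
# just a power of two of the prefix-length difference.
p36_total = 2 ** (36 - 3)        # /36 prefixes under 2000::/3
p64_per_p36 = 2 ** (64 - 36)     # /64 prefixes under one /36
print(p36_total)                 # 8589934592  (~8.6 billion)
print(p64_per_p36)               # 268435456   (~268 million)
```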
Consideration on IPv4
Is our model applicable to IPv4? Theoretically, yes. In IPv4, the server can also be divided into an entrance module and a main service module, the main service module can also be configured with a block of addresses, and the entrance module can also generate a verifiable address using an encryption algorithm and provide redirection. However, there are two problems in IPv4:
1. In IPv6, we use the 64-bit suffix to carry the ciphertext. In IPv4, however, the entire address space is only 32 bits. It cannot provide a sufficient security level in encryption, so the encryption could be easily cracked.
2. Different from the high redundancy of the IPv6 address space, IPv4 addresses are very scarce. In IPv4, allocating a block of addresses to one server is wasteful and literally too expensive (https://ipv4marketgroup.com/ipv4-pricing/). Therefore, it is unrealistic.
Therefore, our model is only recommended for use in IPv6.

Conclusion
In this paper, a novel Internet server model named the addressless server is introduced. This model separates the entrance module from the main service module, and uses the prefix delegation mechanism to allocate an IPv6 prefix instead of an IPv6 address to the main service module. When a user sends a request to the server, the entrance module generates an address under the main service module prefix by encrypting the user address, then redirects the request to the generated address. A time-varying salt is added in the encryption process to make the address different in each connection. The main service module verifies each request received and drops all packets that fail verification. In this way, our model can prevent the main server from being scanned and resist other security threats such as DoS attacks and replay attacks.

Our model takes advantage of the massive IPv6 address space to hide the address in use, and incorporates encryption into IPv6 addresses, fully utilizing the redundant space of the IPv6 address. By allocating a prefix to the main service module, our model eliminates the one-to-one correspondence between the server and the IP address. Our model not only shields the server from scanning, but also establishes a new network framework for servers that conveniently supports desirable features such as load balancing, active-active clusters, and CDNs.

We implement a prototype of the addressless server and conduct simulations and experiments based on it. The results show that users can access the server smoothly while scan traffic cannot get any response. The addresses generated by the algorithm are sufficiently random and uniformly distributed, which prevents attackers from using big data analysis to scan the servers or crack the keys. Our experiments on performance show that the model brings additional delay only during connection establishment, and this does not affect user experience.
Once the connection is established, our model does not affect delay, bandwidth, or jitter.