Anycast Flows

Lecture Outline

• Introduction
• Synchronous P2P Modeling
• P2P Multicast Modeling
• Synchronous P2P Cost Problem
• Other Formulations of Synchronous P2P
• P2P Multicast Network Design Problem
• Concluding Remarks

Lecture Outline

• Introduction

• Synchronous P2P Modeling
• P2P Multicast Modeling
• Synchronous P2P Cost Problem
• Other Formulations of Synchronous P2P
• P2P Multicast Network Design Problem
• Concluding Remarks

Introduction (1)

• In recent years we can observe growing volume

of electronic content in the Internet (music,

movies, e-books, distance learning, software

distribution, etc.)

• More than 50% of network traffic in the Internet

is produced by Peer-to-Peer (P2P) systems

• Currently used P2P systems are not aware of

underlying physical network

• Optimizing of flows in P2P systems can reduce

costs and enable better understanding of these

systems

Introduction (2)

Introduction (3)

P2P systems (1)

• Unstructured

 Centralized (Napster)
 Decentralized (Gnutella 0.4)
 Hybrid (Gnutella 0.6)

• Structured (DHT – Distributed Hash Table)

P2P systems (2)

Taking into account the time scale of the P2P

system we can say about:

• Asynchronous – the time scale is without any

constraints. Research: real systems, stochastic
systems

• Synchronous – the time scale is divided into

slots (iterations). Research: offline models,
simulators

P2P systems (3)

Taking into account the dynamics of the P2P system

we can say about:

• Dynamic P2P systems – peers frequently join and

leave the system, e.g., file sharing systems

• Static P2P systems - peers interested in

participating in the system do not leave the
session so frequently, e.g., P2P multicasting
system applied for dissemination of critical
information, streaming system using STBs (set-
top boxes)

P2P systems (4)

The content to be distributed through P2P

multicasting can be divided into two categories:

• Elastic content (e.g. data files - BitTorrent)
• Streaming content with specific bit rate

requirements (e.g. media streaming, P2P
multicasting)

Lecture Outline

• Introduction

• Synchronous P2P Modeling

• P2P Multicast Modeling
• Synchronous P2P Cost Problem
• Other Formulations of Synchronous P2P
• P2P Multicast Network Design Problem
• Concluding Remarks

Assumptions (1)

• The P2P system works on the top of an overlay

network

• The underlying network is overprovisoned, i.e.

the only potential bottlenecks are access links

• Files to be transferred in the network are divided

into blocks of the same size

• There is an indexing service that provides

information on location and availability of blocks

in network nodes

• This service can be implemented as centralized

(e.g. tracker of BitTorrent) or decentralized (e.g.

DHT)

Assumptions (2)

• The system runs in iterations, i.e. the processing

is divided into timeslots

• Blocks are transferred during time slots, the

transfer of one block between two peers takes

just one time slot

• At the end of each time slot (iteration) the index

service updates the information on location and

availability of blocks

• For instance, if block b was transferred to vertex

(peer) v in time slot t, then all other peers can try

to get this block from v in time slot (t+1)

Example, upload=1

t=0

t=1

t=2

t=3

2,5

3,4,7

Example, upload=2

t=0

t=1

t=2

t=3

2,6

3,4,5,7

Model (1)

indices
b = 1,2,…,B

blocks (pieces)

w,v,s = 1,2,…,V vertices (peers, network nodes)
t = 1,2,…,T

time slots (iterations)

constants
g

= 1, if block b is located in node v before the

P2P

transfer starts (v is the seed); 0,

otherwise

Model (2)

constants
d

maximum download rate of node v

maximum upload rate of node v

large number

variables
y

bwvt

= 1, if block b is transferred to node v from

node w in

iteration t; 0, otherwise (binary)

Model (3)

constraints

+ 



bwvt

= 1,

b = 1,2,…,B v = 1,2,…,V



bwvt

 a

w = 1,2,…,V t = 1,2,

…,T



bwvt

 a

v = 1,2,…,V t = 1,2,

…,T



bwvt

 M(g

+ 

i < t



bswi

), b = 1,2,…,B

w = 1,2,…,V

t = 1,2,…,T.

Lecture Outline

• Introduction
• Synchronous P2P Modeling

• P2P Multicast Modeling

• Synchronous P2P Cost Problem
• Other Formulations of Synchronous P2P
• P2P Multicast Network Design Problem
• Concluding Remarks

Assumptions

• Overlay P2P multicasting (application-layer

multicast) uses a multicast delivery tree

constructed among peers (end hosts)

• Different to traditional IP multicast, the

uploading (non-leaf) nodes in the tree are

normal end hosts

• Peers are connected to the overlay network,

which is considered as an overprovisioned cloud

- capacity constraints are set only on access links

• The graph is fully connected, i.e. each peer can

connect to any other peer

• Various multicast formulations can be used

(e.g., flow, level)

Example

Model (1)

indices
v,w = 1,2,…,V

network nodes

k = 1,2,…,D

receiving nodes (peers)

constants
d

download capacity of node v (kbps)

upload capacity of node v (kbps)

= 1, if link v is the root of streaming; 0,

otherwise

streaming rate (kbps)

Model (2)

variables
x

wvk

= 1, if in multicast tree the streaming path

from the root to node k includes an overlay link

from node w

to node v (no other peer nodes in

between); 0,

otherwise (binary)

= 1, if the link from node w to node v (no

other peer

nodes in between) is in multicast

tree; 0, otherwise

(binary)

Model (3)

constraints


wvk

– 

vwk

= 1, v = k v = 1,2,…,V k = 1,2,

…,K



wvk

– 

vwk

= –1, r

= 1 v = 1,2,…,V k = 1,2,

…,K



wvk

– 

vwk

= 0, v  k r

 1 v = 1,2,…,V k =

1,2,…,K

wvk

 x

v = 1,2,…,V w = 1,2,…,V k = 1,2,…,K



h  d

v = 1,2,…,V



h  u

w = 1,2,…,V



vvt

= 0.

Lecture Outline

• Introduction
• Synchronous P2P Modeling
• P2P Multicast Modeling

• Synchronous P2P Cost Problem

• Other Formulations of Synchronous P2P
• P2P Multicast Network Design Problem
• Concluding Remarks

Synchronous P2P Cost Problem

(1)

• The synchronous model of P2P system is used

(time slots)

• Currently used P2P systems generally choose

logical neighbors at random, therefore they

ignore the underlying Internet topology and ISP

link costs

• Since, P2P system has no impact on the

underlying layer (e.g., IP), the transfers are not

optimized in terms of download paths length

• Consequently, there are many cross-continental

and cross-ISP downloads that can congest

backbone networks and generate additional

operating costs for ISPs

Synchronous P2P Cost Problem

(2)

• Localization of peers and information for costs

can be obtained using:

– IP location databases
– Match IP addresses with the same prefix
– Traceroute records

Synchronous P2P Cost Problem

(3)

• We use a broad-spectrum constant



that is

defined as cost of one block transfer between
peers w and v

• It can be interpreted arbitrarily according to our

needs:

– number of hops between w and v
– number of ISPs between w and v
– RTT between w and v
– distance in kilometers between w and v
– cost of cross-ISP transfers

Synchronous P2P Cost Problem

(4)

• An important characteristic of P2P systems is

dynamics – peers can frequently join or leave
the network

• To model this phenomenon we use constants a

which equals to 1, if peer v in time slot t is
connected to the network (is available) and 0,
otherwise

• Consequently, although the model is

deterministic, the stochastic nature of P2P system
can be incorporated into the considerations

Synchronous P2P Cost Problem

(5)

Given: overlay network topology, peers, seed

location, access link capacity, link availability, link
cost, number of blocks, number of time slots

Minimize: cost

Over:P2P synchronous routing

Synchronous P2P Cost Problem

(6)

indices
b = 1,2,…,B

blocks (pieces)

w,v = 1,2,…,V

vertices (peers, network nodes)

t = 1,2,…,T

time slots (iterations)

constants
g

= 1, if block b is located in node v before the

P2P

transfer starts (v is the seed); 0,

otherwise



cost of block transfer from node w to node v

= 1, if node v is available in time slot t;

0, otherwise

Synchronous P2P Cost Problem

(7)

constants
d

maximum download rate of node v

maximum upload rate of node v

large number

variables
y

bwvt

= 1, if block b is transferred to node v from

node w in

iteration t; 0, otherwise (binary)

Synchronous P2P Cost Problem

(8)

objective
minimize F = 



bwvt



constraints
g

+ 



bwvt

= 1,

b = 1,2,…,B v = 1,2,…,V



bwvt

 a

w = 1,2,…,V t = 1,2,…,T



bwvt

 a

v = 1,2,…,V t = 1,2,…,T



bwvt

 M(g

+ 

i < t



bswi

), b = 1,2,…,B w = 1,2,

…,V

t = 1,2,…,T.

Optimization

• The problem is linear, integer (binary) and

NP-complete (equivalent to MBT (Minimum
Broadcast Time) problem)

• To find optimal solutions solvers like CPLEX and

GuRoBi can be used

• Heuristics can be used, e.g., constructive,

random, evolutionary algorithm, etc.

Algorithms (1)

1  for t=1 to T
2   begin
3    while (IsPossibleTranfer(t))
4     begin
5      v=RandomDownloadPeer(t)
6      b=SelectBlock(v,t)
7      w=SelectUploadPeer(b,v,t)
8      TransferBlock(b,w,v,t)
9     end (if)
10   end (for)

Algorithms (2)

Functions SelectBlock and SelectUploadPeer have

different versions according to a particular strategy:

• Random Strategy (RS) – selects the block and

upload peer at random among all feasible blocks

and upload nodes

• Rarest First Strategy (RFS) function SelectBlock

returns the rarest feasible block in the network, the

upload peer is chosen at random

• Cost Selection Strategy (CSS) – takes into

account transfer costs, the block to be transferred

is selected at random, but the closest, feasible peer

is chosen for upload

• Weighty Piece Selection Strategy (WPSS)

Results (1)

• To solve the model in optimal way we apply

CPLEX 11.0

• Since the goal is to compare results of BitTorrent-

like systems simulations against optimal results,
we had to limit size of the problem instances
in order to obtain optimal results approximately in
one hour

• After several experiments we decide to test

networks consisting of 10 vertices (peers), 3
blocks to be transferred and 4 time slots

Results (2)

• Peers are located in large cities worldwide
• The cost of one block transfer between two peers

located in two cities just as the distance in
kilometers between these two cites

• Three networks were tested:

– E5A5 – 5 peers in Europe and 5 peers in North

America

– E7A3 – 7 peers in Europe and 3 peers in North

America

– E10 – 10 peers in Europe

Results (3)

• Three scenarios of access link capacity:

– asymmetric (upload=2 blocks, download=4

blocks)

– symmetric (upload=2 blocks, download=2

blocks)

– random (upload and download is selected

randomly between 2 blocks and 4 blocks)

Average distance to optimal results for various

networks

Average number of cross-continent transfers

for various networks

Heuristics against optimal results obtained for

network E5A5 as a function of seed location

Optimal results obtained for network E5A5 as a

function of upload capacity

Optimal results obtained for network E5A5 as a

function of peer availability and seed location

Average number of cross-continent transfers for

various networks (100 nodes, 50 blocks, 40 time

slots)

Lecture Outline

• Introduction
• Synchronous P2P Modeling
• P2P Multicast Modeling
• Synchronous P2P Cost Problem

• Other Formulations of Synchronous P2P

• P2P Multicast Network Design Problem
• Concluding Remarks

Download Time Objective

• In the model the download time is represented

by T – a constant denoting the number of time

slots

• If T a is used as the objective function, the

problem becomes a Non-linear Integer Problem

• The following procedure can be employed to find

the minimal value of T

– Set T to some value
– Solve the IP problem consisting of P2P system

constraints (selected among those presented above)

– If the problem has a feasible solution, then decrease T

by 1; otherwise increase T by 1

The download time as a function of availability for

various values of upload capacity

Throughput

• Throughput of the P2P system, which is defined

as the number of blocks (which can be
interpreted as the size of a file) that can be
transferred to every node within given time slots

• In the model this is denoted by B – a constant

denoting the number of blocks

• The procedure to solve the IP problem with the

throughput as the objective can be the same as
in the case of the download time

The system throughput as a function of availability

for various values of upload capacity

Fairness Constraint

• The P2P system should be fair in terms of the

number of blocks served by the individual nodes

• Fairness can be viewed as a kind of an incentive

for nodes to participate, especially in situations
where ISPs charge based on uplink usage or
uplink bandwidth is scarce

• To enforce fairness in the P2P system we add

constraint



bvwt







bwvt

v = 1,2,…,V

• Constant



denotes the fairness of the system

Neighbors Constraint

• In the BitTorrent protocol a new peer joining the

system asks the tracker for a list of peers to
connect to and cooperate with (exchange blocks)

• Such peers are called neighbors
• To model this we introduce a constant e

which

equals 1 if peers w and v are neighbors, 0
otherwise

• Next a following condition can be formulated



bwvt

 Me

w = 1,2,…,V v = 1,2,…,V

Lecture Outline

• Introduction
• Synchronous P2P Modeling
• P2P Multicast Modeling
• Synchronous P2P Cost Problem
• Other Formulations of Synchronous P2P

• P2P Multicast Network Design Problem

• Concluding Remarks

Network Design Problem (1)

• CFA (Capacity and Flow Allocation) problem
• Multiple trees using for streaming
• Level formulation of multicast flows
• Candidate access links
• Each peer has background traffic (download

and upload)

Network Design Problem (2)

Given: network topology, root location, number of

trees, streaming rates, background traffic

Minimize: network cost

Over:P2P multicast routing, link allocation

Network Design Problem (3)

indices
w,v = 1,2,…,V

overlay network nodes (peers)

k = 1,2,…,K

access link types for node v

t = 1,2,…,T

trees

l = 1,2,…,L levels of the multicast tree
constants
a

download background transfer of node v

upload background transfer of node v

Network Design Problem (4)

constants



cost of link type k for node v

download capacity of link type k for node v

(kbps)

upload capacity of link type k for node v

(kbps)

= 1, if node v is the root of the tree; 0,

otherwise

the streaming rate of tree t (kbps)

large number

Network Design Problem (5)

variables
x

wvtl

= 1, if there is a link from node w to node v is

in multicast tree t and w is located on level l of
tree

t; 0, otherwise (binary)

= 1, if node v is connected to the overlay

network by

a link of type k; 0, otherwise

(binary)

objective
minimize F = 





Network Design Problem (6)

constraints




wvtl

= (1 – r

v = 1,2,…,V t = 1,2,…,T



wvt1

 Mr

w = 1,2,…,V t = 1,2,…,T

wvt(l+1)

 

swtl

v = 1,2,…,V w = 1,2,

…,V

t = 1,2,

…,T l = 1,2,…,L – 1



= 1,

v = 1,2,…,V

+ 

 

v = 1,2,…,V

+ 

vw



wvtl

 

w = 1,2,…,V.

Optimization

• The problem is linear, integer (binary) and

NP-complete (equivalent to unicast network
design problem)

• To find optimal solutions solvers like CPLEX and

GuRoB can be used

• Heuristics can be used, e.g., construction

algorithms, Lagrangean Relaxation, evolutionary
algorithm, etc.

Algorithm Create Tree (1)

• Create Tree is a constructive algorithm using the

following functions

• We assume that for each node v = 1,2,…,V the

access link types are sorted according to

increasing values of upload capacity and cost.

• Tree t is feasible on level l if there is at least one

possible transfer from node v (located on level l

of tree t) to node w

• Let ftree(l) return an index of a feasible tree on

level l. If there is more than one feasible tree, we

select the tree with the lowest number of nodes

connected to the tree

Algorithm Create Tree (2)

• Node v is a feasible parent node on level l of

tree t if there is at least one possible transfer in

tree t on level l between v and any other node

• Let fpnode(t, l) return an index of a feasible

parent node located on level l of tree t. If there is

more than one feasible parent node, the

algorithm chooses the node with the largest value

of residual upload capacity

• Let fcnode(v, t, l) return an index of a feasible

child node of node v located on level l of tree t . If

there is more than one feasible child node, again

the residual upload capacity is the additional

criterion

Algorithm Create Tree (3)

• Let istrasnfer(l) return 1 if there is at least one

possible transfer on level l of any tree, 0 otherwise

• If each node v = 1,2,…,V is connected to each tree

t = 1,2,…,T, i.e. all necessary transfers are

completed, function istree() returns 1; otherwise

it returns 0

• Function isupdate() returns 1 if incrementing of

the access link upload capacity is possible for at

least one node; 0 otherwise.

• We assume that first(A) returns the first element

of table A

• Let h = t h

denote the overall streaming tree

rate of all trees

Algorithm Create Tree (4)

• Let function updatenode() return an index of a

node v, for which the upload capacity can be
augmented. If there is more than one such a
node, an additional criterion is applied. In the
algorithm we use several combinations of two
values: node level and relative cost of upload
capacity increase given by the formula
(u

v(k+1)

– u

) / (



v(k+1)

–



)

Algorithm Create Tree (5)

Step 0. Set x

wvtl

= 0 for each w = 1,2,…,V, v = 1,2,

…,V, t = 1,2,…,T, l = 1,2,…,L, v  w.

Set y

as the minimal values that guarantee

sufficient download capacity for each node v
expect the root node (i.e. d

 a

+ h) and the

sufficient upload capacity for the root node r (i.e.
u

 b

+ h).

Algorithm Create Tree (6)

Step 1. Set l = 1. Create table A containing all trees

sorted in decreasing order of streaming rate q

Create table B containing all nodes except the root
node sorted in decreasing order of residual upload
capacity of each node.

a) If A =  go to Step 2. Otherwise, go to Step 1b.
b) Calculate t = first(A), v = first(B) and x

rvtl

= 1.

Next, set A = A – {t} and B = B – {v}. Go to Step
1a.

Algorithm Create Tree (7)

Step 2. If l > L go to Step 3
a) If istrasnfer(l) = 0 set l = l + 1 and go to Step 2.

Otherwise go to Step 2b.

b) Set t = ftree(l), w = fpnode(t, l), v = fcnode(v, t, l)

and x

wvtl

= 1. Go to Step 2a

Step 3. If istree() = 1 stop the algorithm, a feasible

solution exists. Otherwise, go to Step 4.

Step 4. If isupdate() = 0, stop the algorithm, there is

not feasible solution. Otherwise, go to Step 5.

Step 5. Set v = updatenode(). Find k, for which

= 1. Set y

= 0, k = k + 1, y

= 1, l = 1 and go

to Step 2

Results (1)

• We use DSL price lists of four ISPs: two of them

operate in Poland (TP and Dialog) and two other

operate in Germany (DT and Arcor)

• Each node is randomly assigned to one of ISPs

and selects any option included in the price list

• The values of download background transfer

are selected at random between 512 kbps and

1024 kbps

• Analogously the values of upload background

transfer were selected at random between 64

kbps and 128 kbps

Results (2)

• To obtain optimal results we use CPLEX 11.0

solver

• Due to complexity of the problem only for small

networks (20 nodes) the CPLEX can yield
optimal results in reasonable time of
approximately one hour

• The number of multicast trees was in the range

1-4

• The number of levels was in the range 2-8
• We compared the Create Tree algorithm also

against Lagrangean Relaxation algorithm

Heuristic algorithm versus optimal results and

Lagrangean relaxation results

The network cost as a function of tree levels for

various number of multicast trees

The network cost as a function of tree levels for 2

trees and various scenarios of streaming rate (SR)

split

The network cost as a function of the root location

for various values of tree levels

Lecture Outline

• Introduction
• Synchronous P2P Modeling
• P2P Multicast Modeling
• Synchronous P2P Cost Problem
• Other Formulations of Synchronous P2P
• P2P Multicast Network Design Problem

• Concluding Remarks

Concluding Remarks

• Most of P2P systems work in the overlay mode
• The main problem in modeling of P2P systems is

the dynamics in the time scale

• Synchronous models with the time scale have

been presented

• In static P2P systems formulations based on

multicast flows can be used with additional

constraints following from the overlay network

• Optimization of P2P models, especially with the

time scale is very complicated due to the

problem size

Document Outline

Slide 1
Slide 2
Slide 3
Slide 4
Slide 5
Slide 6
Slide 7
Slide 8
Slide 9
Slide 10
Slide 11
Slide 12
Slide 13
Slide 14
Slide 15
Slide 16
Slide 17
Slide 18
Slide 19
Slide 20
Slide 21
Slide 22
Slide 23
Slide 24
Slide 25
Slide 26
Slide 27
Slide 28
Slide 29
Slide 30
Slide 31
Slide 32
Slide 33
Slide 34
Slide 35
Slide 36
Slide 37
Slide 38
Slide 39
Slide 40
Slide 41
Slide 42
Slide 43
Slide 44
Slide 45
Slide 46
Slide 47
Slide 48
Slide 49
Slide 50
Slide 51
Slide 52
Slide 53
Slide 54
Slide 55
Slide 56
Slide 57
Slide 58
Slide 59
Slide 60
Slide 61
Slide 62
Slide 63
Slide 64
Slide 65
Slide 66
Slide 67
Slide 68
Slide 69
Slide 70
Slide 71
Slide 72
Slide 73
Slide 74
Slide 75
Slide 76