Recent Advances in Network Computing


Zhiwei Zhao
zhiwei@mobinets.org

Two lines

Recent advances
1. NDN/SDN/NFV
2. Edge computing

Turing award winners
1. Stories
2. Interesting works
3. Lectures

Coursework (60%+40%)

A one-column, two-page course work.

Sample pdf, Template tex file.

Attendance

The roll will be called when many are absent.
No final score if you are found absent three or more times.

Submission

Email: fmi_course@mobinets.org
Deadline: May 26, 2022.

Recent advances

NDN: Named Data Networking.

  • An instance of ICN[1][2].
  • Data has names and can be identified.

SDN: Software Defined Networks

NFV: Network Function Virtualization

Edge: Edge Computing

  • Cloud in proximity.
  • Dispersed and resource constrained.

Named Data Networks

Addressing: IP vs ID
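
To make the "IP vs ID" contrast concrete, here is a minimal sketch of name-based forwarding. It assumes a simplified NDN node with only a Content Store and a FIB (the Pending Interest Table is left out), and the names and face labels are made up for illustration.

```python
def longest_prefix_match(fib, name):
    """Return the next hop of the longest FIB prefix that matches the name."""
    components = name.strip("/").split("/")
    # Try the full name first, then drop one trailing component at a time.
    for end in range(len(components), 0, -1):
        prefix = "/" + "/".join(components[:end])
        if prefix in fib:
            return fib[prefix]
    return None


def handle_interest(content_store, fib, name):
    """Answer from the local cache if possible, otherwise forward by name."""
    if name in content_store:
        return ("data", content_store[name])
    next_hop = longest_prefix_match(fib, name)
    if next_hop is None:
        return ("drop", None)
    return ("forward", next_hop)


# Hypothetical cache and forwarding table, keyed by data names, not by hosts.
content_store = {"/edu/course/slides/1": b"slide-1"}
fib = {"/edu": "face-1", "/edu/course": "face-2"}

print(handle_interest(content_store, fib, "/edu/course/slides/1"))
print(handle_interest(content_store, fib, "/edu/course/slides/2"))
```

The request never mentions where the data lives. Any node holding a copy under that name can answer it.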

[1] Lixia Zhang et al. Named data networking, ACM SigComm 2014.

Software Defined Network

[1] N Gude et al. NOX: Towards an Operating System for Networks. ACM SigComm 2008.
[2] N McKeown et al. OpenFlow: Enabling Innovation in Campus Networks. ACM SigComm 2008.

Network Function Virtualization

[1] ETSI, Network Functions Virtualisation, White Paper.
[2] Bo Han et al. Network function virtualization: Challenges and opportunities for innovations, IEEE Comm. Mag. 2015.

Edge Computing

[1] Mahadev Satyanarayanan et al. The Case for VM-based Cloudlets in Mobile Computing, IEEE Pervasive Computing, 2009.
[2] Mahadev Satyanarayanan, The Emergence of Edge Computing, Computer, 2017.
[3] J Chen and X Ran, Deep learning with edge computing: A review, Proc. of IEEE, 2019.

Network computing

Centralized🔁Distributed

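* Yunxin Liu. Challenges and Opportunities in Intelligent Edge Computing. Communications of the CCF, Sept. 2021.
* Xuehai Wang et al. Research Report on Edge Computing Technology. Institute of Computing Technology, Chinese Academy of Sciences, FIAR-02, Feb. 2018.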

Grow ripe

Tech. vision

Embedding each other

  • C → N: SDN, NFV
  • N → C: Cloud, Edge, D2D/D4D

Edge enables AIoT

Advances in networking bring computing on board.

🏡 IoT                  🔄 Link          🖥 Server
Offloading decisions    Task transfer    Task processing
Analysis/divisions      Multi-access     Service management

Edge Computing

Why is it named "Edge"?

Similar concepts

Cloudlet
Fog
MEC

Architecture

Cloud-edge-end

Architecture

Tencent's attempt

* Computing power network? Computing-network convergence?

Application requirements

Type             Sensitivity  Latency     QoE
AR               High         <10ms
VR               High         <10ms
Game             Fair         1-30ms      Perfect
                              31-50ms     Satisfactory
                              51-100ms    Acceptable
                              100-200ms   Poor
                              200-500ms   Unsatisfactory
                              >500ms      Unacceptable
Voice call                    0-100ms     Good
                              100-250ms   Fair
                              250-1100ms  Unacceptable
Video streaming  Low          0.2-2s      Good
                              2-6s        Fair
                              6-18s       Poor
Web browsing     Low          0-350ms     Good
                              400-700ms   Average

Differences from cloud

Ultra-low latency

Scenario          Comm.      Latency  Energy
Mobile computing  5G/6G      5ms      High
Mobile computing  4G         50ms     High
Indoor            WiFi       150ms    Fair
Indoor            Bluetooth  200ms    High
Embedded          ZigBee     1s       Low
Embedded          LoRa       2s       Low
Embedded          NB-IoT     2s       Low
Embedded          Sigfox     2s       Low
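
As a rough illustration of why ultra-low latency matters, the sketch below checks which access technologies fit an application's latency budget. The link latencies are copied from the table above. The function name and the optional processing time are assumptions made for this example.

```python
# Communication latency per access technology, in milliseconds (from the table).
LINK_LATENCY_MS = {
    "5G/6G": 5,
    "4G": 50,
    "WiFi": 150,
    "Bluetooth": 200,
    "ZigBee": 1000,
    "LoRa": 2000,
    "NB-IoT": 2000,
    "Sigfox": 2000,
}


def feasible_links(budget_ms, processing_ms=0.0):
    """List the links whose latency plus edge processing fits the budget."""
    return [
        link
        for link, latency in LINK_LATENCY_MS.items()
        if latency + processing_ms <= budget_ms
    ]


print(feasible_links(10))    # AR/VR-class budget
print(feasible_links(100))   # "acceptable" gaming budget
```

With a 10 ms budget only the 5G/6G row qualifies, and even that leaves just a few milliseconds for processing.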

Heterogeneity

Heterogeneous devices directly connected

Wireless communications

WiFi Bluetooth ZigBee

Service types

Face recognition Object tracking Voice identification
Remote sensing Event detection Fed-server

Limited resource

Product              CPU                                       GPU                                        Memory       Storage  Perf
Jetson Nano          Cortex-A57 (4 cores) @1.43GHz             128-core Maxwell @921MHz                   4GB LPDDR4   16GB     472 GFLOPS
Jetson TX2           Cortex-A57 (4 cores) @2GHz +              256-core Pascal @1300MHz                   8GB LPDDR4   32GB     1.3 TFLOPS
                     NVIDIA Denver (2 cores) @2GHz
Jetson AGX Xavier    NVIDIA Carmel ARMv8.2 (8 cores) @2.26GHz  512-core Volta @1377MHz + 64 Tensor cores  32GB LPDDR4  32GB     32 TOPS
Huawei Ascend Atlas  Unknown                                   Unknown                                    -            -        22 TOPS
HiSilicon Hi3559A    Cortex A73+A53+A51 @1.6+1.2+1.0GHz        -                                          4GB          8GB      4 TOPS
Cloud                32-core CPU                               Titan 24G                                  64GB         24TB     375 GOPS

Mobility sensitive

Direct connection

Limited coverage

More mobile devices

🚘 Auto driving
🚶 Pedestrians
🤖 Mobile robots

Mobility

Privacy preserving

Data does not go far

More complex crypto methods can be supported by edge enhancement

Novel paradigms like Federated Learning

Edge-based vs Cloud-based
Centralized vs Decentralized

Driving force

Importance

Enabling anywhere computing

Energy efficient

2000            2005               2007             2020
70 billion kWh  152.5 billion kWh  330 billion kWh  >1 trillion kWh
*The Three Gorges Dam generates 100 billion kWh per year.

Change way of life

Everything now can be smart

Transparent computing

*The controversy over transparent computing
*Network Operating System (Disaggregated OS)
*Remote Direct Memory Access

Much more secure

Edge servers can be owned by user

Data need never leave the edge

Edge allows more secure and privacy preserving paradigms

Recent papers

IoT+Edge (AIoT)

Hottest term in the statistics

IoT access to Edge services

Low-power constraints
Pervasive interference
Massive, diverse and heterogeneous

5G+Edge computing

Ultra-low latency

Ultra-small coverage

A 5G network is meaningless without edge computing.
---- Many experts

Virtualization

Service on COTS hardware

Various techs

Instruction level: Bochs, Qemu
Hardware abstraction: Hypervisor (all resources), Hosted (no resources), Hybrid (program and memory); XEN/KVM/VMware.
OS level: chroot, Linux Vserver, Open VZ, LXC, Docker
Library level: Wine (OSX), WSL
Programming language level: JVM, CLR (common language runtime)

Offloading

"Offload" tasks from devices to edge servers

Decision, gaming, transfer, etc.

Offloading (1)

0-1 Offloading decision

Offloading whole tasks, regardless of their dependencies.
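
A minimal sketch of the 0-1 decision, assuming a simple delay model that is not taken from the slides: local time is CPU cycles over local frequency, and edge time is upload time plus remote execution time. All parameter names and numbers are illustrative.

```python
def offloading_decision(cycles, data_bits, f_local_hz, f_edge_hz, rate_bps):
    """Return (offload?, local delay, edge delay) for one indivisible task."""
    t_local = cycles / f_local_hz
    # Offloading pays for the upload first, then runs on the faster server.
    t_edge = data_bits / rate_bps + cycles / f_edge_hz
    return t_edge < t_local, t_local, t_edge


# Hypothetical task: 2 Gcycles of work, 1 MB of input data.
fast_link = offloading_decision(2e9, 8e6, 1e9, 10e9, rate_bps=20e6)
slow_link = offloading_decision(2e9, 8e6, 1e9, 10e9, rate_bps=1e6)
print(fast_link)  # upload is cheap, so the edge wins
print(slow_link)  # upload dominates, so the task stays local
```

The same task flips between "offload" and "stay local" purely because of the link rate. That is why the decision is usually made at run time.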

Offloading (2)

Sub task offloading

Offloading the (bottleneck) sub-tasks

Offloading (3)

Offloading many sub-tasks with dependencies
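
When sub-tasks depend on each other, each one gets its own placement and every dependency that crosses the device/edge boundary costs a transfer. The sketch below brute-forces all placements for a tiny task graph. The graph, the timings, the fixed transfer cost and the assumption that independent sub-tasks run in parallel are all made up for illustration.

```python
from itertools import product


def completion_time(order, deps, local_t, edge_t, transfer_t, placement):
    """Finish time of the task graph under one placement of sub-tasks."""
    finish = {}
    for task in order:  # `order` must be a topological order of the graph
        ready = 0.0
        for dep in deps.get(task, ()):
            crossing = placement[dep] != placement[task]
            ready = max(ready, finish[dep] + (transfer_t if crossing else 0.0))
        run = edge_t[task] if placement[task] == "edge" else local_t[task]
        finish[task] = ready + run
    return max(finish.values())


def best_placement(order, deps, local_t, edge_t, transfer_t, pinned_local=()):
    """Try every local/edge assignment of the sub-tasks that are not pinned."""
    free = [t for t in order if t not in pinned_local]
    best = None
    for choice in product(("local", "edge"), repeat=len(free)):
        placement = {t: "local" for t in pinned_local}
        placement.update(zip(free, choice))
        total = completion_time(order, deps, local_t, edge_t, transfer_t, placement)
        if best is None or total < best[0]:
            best = (total, placement)
    return best


order = ["capture", "detect", "track", "render"]
deps = {"detect": ["capture"], "track": ["capture"], "render": ["detect", "track"]}
local_t = {"capture": 1.0, "detect": 4.0, "track": 4.0, "render": 1.0}
edge_t = {"capture": 0.5, "detect": 1.0, "track": 1.0, "render": 0.5}

# Capture and render touch the sensor and the screen, so they stay on the device.
total, placement = best_placement(order, deps, local_t, edge_t, 1.0,
                                  pinned_local=("capture", "render"))
print(total, placement)
```

Exhaustive search only works for a handful of sub-tasks, since the number of placements doubles with each one.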

Offloading (4)

Reliable and timely task transfer

Offloading (5)

Offloading - Processing pipeline

Establish a pipeline to avoid wasted time at the edge servers.
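
The gain from pipelining can be estimated with a small two-stage model, assuming the task is split into equal chunks with a fixed transfer time and a fixed processing time each. The function names and numbers are illustrative.

```python
def sequential_time(n_chunks, transfer_t, process_t):
    """Each chunk is fully transferred and processed before the next starts."""
    return n_chunks * (transfer_t + process_t)


def pipelined_time(n_chunks, transfer_t, process_t):
    """Transfer of chunk i+1 overlaps with processing of chunk i."""
    if n_chunks == 0:
        return 0.0
    # After the first chunk fills the pipeline, the slower stage sets the pace.
    return transfer_t + process_t + (n_chunks - 1) * max(transfer_t, process_t)


print(sequential_time(4, 2.0, 3.0))  # 20.0
print(pipelined_time(4, 2.0, 3.0))   # 14.0
```

The closer the two stage times are, the less time the server spends waiting for data.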

Offloading (6)

Architectural offloading

Resource allocation

"Allocate" resources to services (VMs)

Optimization problem, scheduling strategies

Integer Programming

Service cache

Overhead, hit rate, request amount, etc.

Knapsack problem
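
Service caching maps onto the 0/1 knapsack problem: each service has a storage size and a caching benefit, and the edge server has limited capacity. Below is a minimal dynamic-programming sketch. The service names, sizes and benefit values are invented for illustration.

```python
def choose_cached_services(services, capacity):
    """0/1 knapsack: maximize total benefit within the storage capacity.

    `services` is a list of (name, size, benefit) with integer sizes.
    Returns (best benefit, tuple of cached service names).
    """
    # best[c] holds the best (benefit, names) achievable with capacity c.
    best = [(0, ())] * (capacity + 1)
    for name, size, benefit in services:
        # Iterate capacity downwards so each service is cached at most once.
        for c in range(capacity, size - 1, -1):
            candidate = best[c - size][0] + benefit
            if candidate > best[c][0]:
                best[c] = (candidate, best[c - size][1] + (name,))
    return best[capacity]


services = [("face", 4, 40), ("track", 3, 50), ("voice", 2, 30), ("detect", 5, 65)]
benefit, cached = choose_cached_services(services, capacity=7)
print(benefit, cached)
```

In practice the benefit would come from request counts and hit rates, and sizes would be measured in storage units.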

Edge AI

Lightweight training/inference

TinyML, Pruning, FPGA
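
Pruning can be illustrated with magnitude pruning, one common variant (the slides do not say which method is meant): the weights with the smallest absolute value are set to zero. This is a plain-Python sketch on a flat weight list, with made-up weights.

```python
def magnitude_prune(weights, sparsity):
    """Zero out the given fraction of weights with the smallest magnitude."""
    n_drop = int(len(weights) * sparsity)
    # Indices sorted by absolute value; the first n_drop are pruned.
    by_magnitude = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    dropped = set(by_magnitude[:n_drop])
    return [0.0 if i in dropped else w for i, w in enumerate(weights)]


weights = [0.9, -0.05, 0.3, 0.01, -0.7, 0.2]
pruned = magnitude_prune(weights, sparsity=0.5)
print(pruned)
```

The zeroed weights can then be stored and computed sparsely, which reduces memory and inference cost on constrained edge devices.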

Federated Learning

Distributed learning

Model based, data based

Summary

Cloud/Edge: Service framework, task scheduling, resource allocation, server deployment, service caching, internetworking, service routing, distributed storage, etc.
End device: Task partitioning, offloading decision, call graph analysis, structure analysis, service-aware compiling, lightweight AI, etc.
Interactions: Bulk data transfer, multi-access, service discovery, channel allocation, offloading game, etc.

Some more sys

Orbital Edge, Edge infrastructure, VANET Edge, UAV Edge

OEC (1)

OEC (2)

Inter-sat pipeline

Frame-level, block-level vs Sparse, Gathered

Federated Learning (1)

Can data live at the edge?

Improved latency
Works offline
Better battery life
Privacy advantages

On-device inference for mobile keyboards and cameras.

Federated Learning (2)

Three choices back in 2014:

Don't use data to improve products
Log the data centrally anyway
Invent a new solution, Federated Learning

Federated Learning (3)

Model engineer workflow

Federated Learning (4)

Model engineer workflow

Federated Learning (5)

Model deployment workflow

Federated Learning (6)

Model engineer workflow

Federated Learning (7)

FL workflow

Federated Learning (8)

FL workflow

Federated Learning (9)

FL workflow

Federated Learning (10)

FL workflow

Only-in-aggregate: the engineer may only access combined device reports.

Federated Learning (11)

FL workflow

Federated Learning (12)

FL workflow

100-1000s of users per round, 100-1000s of rounds to convergence, 1-10 minutes per round
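
The server side of one round can be sketched as a weighted average of the client models, in the style of federated averaging (FedAvg). This is a common aggregation rule, not necessarily the one shown on the slides. Models are flat weight lists here, and the client data is invented.

```python
def federated_average(client_updates):
    """Average client weights, weighted by each client's local sample count.

    `client_updates` is a list of (num_samples, weights) pairs, where all
    weight lists have the same length.
    """
    total_samples = sum(n for n, _ in client_updates)
    dim = len(client_updates[0][1])
    return [
        sum(n * w[i] for n, w in client_updates) / total_samples
        for i in range(dim)
    ]


# Two hypothetical clients report locally trained weights for this round.
round_updates = [(10, [1.0, 2.0]), (30, [3.0, 6.0])]
global_weights = federated_average(round_updates)
print(global_weights)  # [2.5, 5.0]
```

Only the weights and sample counts reach the server. The raw training data stays on the devices.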

Federated Learning (13)

Data locality and distribution
    1. Massively decentralized, naturally arising (non-IID) partition
    2. Data is siloed, held by a small number of coordinating entities
    3. System-controlled (e.g. shuffled, balanced)
Data availability
    1. Limited availability, time-of-day variations
    2. Almost all data nodes always available
Addressability
    1. Data nodes are anonymous and interchangeable
    2. Data nodes are addressable
Node statefulness
    1. Stateless (generally no repeat computation)
    2. Stateful
Node reliability
    1. Unreliable (~10% failures)
    2. Reliable
Wide-area communication pattern
    1. Hub-and-spoke topology
    2. Peer-to-peer topology (fully decentralized)
    3. None (centralized to one datacenter)
Distribution scale
    1. Massively parallel (1e9 data nodes)
    2. Single datacenter
Primary bottleneck
    1. Communication
    2. Computation
Open issues
    1. Slow convergence
    2. Averaging
    3. Client selection
    4. Clustered edge network

Talk with Satya