# End-to-End TURN Flow (UDP + TLS)
This document describes the end-to-end flow **currently implemented** in `niom-turn`, based on the MVP code:
- UDP control plane + UDP data plane: [src/server.rs](../src/server.rs), [src/alloc.rs](../src/alloc.rs), [src/stun.rs](../src/stun.rs), [src/auth.rs](../src/auth.rs)
- TLS control plane ("turns") with STUN framing: [src/tls.rs](../src/tls.rs)
- Test builders (how requests are constructed): [tests/support/stun_builders.rs](../tests/support/stun_builders.rs)
## Terms
- **Client**: TURN client (e.g. a WebRTC ICE agent)
- **Server**: `niom-turn` (this project)
- **Peer**: the remote endpoint to relay to (typically another WebRTC endpoint)
- **Allocation**: server-side session providing a **UDP relay socket**
- **Permission**: authorisation, scoped to an allocation, to send to a peer and to accept packets from that peer
- **Channel binding**: mapping `channel-number -> peer` to enable ChannelData
---
## UDP: sequence diagram (happy path)
```mermaid
sequenceDiagram
autonumber
participant C as Client (TURN)
participant S as niom-turn (UDP:3478)
participant R as Relay Socket (UDP:ephemeral)
participant P as Peer
Note over C,S: 1) Allocate without auth → 401 challenge
C->>S: STUN Allocate Request (no MI)
S->>C: STUN Error Response 401 + REALM + NONCE
Note over C,S: 2) Allocate with long-term auth
C->>S: STUN Allocate Request + USERNAME/REALM/NONCE + MESSAGE-INTEGRITY
S->>R: bind("0.0.0.0:0"), spawn Relay-Loop
S->>C: Allocate Success + XOR-RELAYED-ADDRESS + LIFETIME (+ MESSAGE-INTEGRITY + FINGERPRINT)
Note over C,S: 3) Permission (CreatePermission optional)
C->>S: CreatePermission + XOR-PEER-ADDRESS (+ Auth + MI)
S->>C: Success (200) (+ MESSAGE-INTEGRITY + FINGERPRINT)
Note over C,S: 4) Send (client→peer via relay)
C->>S: Send + XOR-PEER-ADDRESS + DATA (+ Auth + MI)
S->>R: relay.send_to(DATA, Peer)
R->>P: UDP payload (source = relay_addr)
Note over P,C: 5) Return path (peer→client)
P->>R: UDP payload (dest = relay_addr)
alt Channel binding exists
R->>S: recv_from(Peer)
S->>C: ChannelData(channel, payload)
else No channel binding
R->>S: recv_from(Peer)
S->>C: Data Indication (METHOD_DATA|INDICATION) + XOR-PEER-ADDRESS + DATA
end
Note over C,S: 6) Optional: ChannelBind + ChannelData
C->>S: ChannelBind + CHANNEL-NUMBER + XOR-PEER-ADDRESS (+ Auth + MI)
Note over C,S: Interop: ChannelBind may implicitly create the permission for this peer
S->>C: Success (200) (+ MESSAGE-INTEGRITY + FINGERPRINT)
C->>S: ChannelData(channel, payload)
S->>R: relay.send_to(payload, Peer)
R->>P: UDP payload
Note over C,S: 7) Refresh
C->>S: Refresh + LIFETIME (+ Auth + MI)
S->>C: Success + LIFETIME(applied) (+ MESSAGE-INTEGRITY + FINGERPRINT)
```
### What the server does (concretely)
- Entry point: `udp_reader_loop` in [src/server.rs](../src/server.rs)
- Early branch: if `parse_channel_data(...)` succeeds, the packet is **not** parsed as STUN; ChannelData is forwarded directly (only if allocation + binding + permission checks pass).
- STUN/TURN requests are parsed via `parse_message(...)` in [src/stun.rs](../src/stun.rs).
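The early ChannelData branch can be pictured with the following sketch. It is not the project's `parse_channel_data`; the name `parse_channel_data_sketch`, the `ChannelData` struct and the return type are assumptions made for illustration. It relies only on the wire format from RFC 5766: a 2-byte channel number in the range `0x4000..=0x7FFF`, a 2-byte payload length, then the payload. Because STUN messages start with two zero bits, a first 16-bit word in that range cannot be a STUN header.

```rust
/// Borrowed view of a ChannelData frame (hypothetical type for this sketch).
#[derive(Debug, PartialEq, Eq)]
pub struct ChannelData<'a> {
    pub channel: u16,
    pub payload: &'a [u8],
}

/// Illustrative stand-in for `parse_channel_data`; the real signature may differ.
pub fn parse_channel_data_sketch(buf: &[u8]) -> Option<ChannelData<'_>> {
    // Header: 2 bytes channel number + 2 bytes payload length.
    if buf.len() < 4 {
        return None;
    }
    let channel = u16::from_be_bytes([buf[0], buf[1]]);
    // RFC 5766 channel range; anything else is left to the STUN parser.
    if !(0x4000..=0x7FFF).contains(&channel) {
        return None;
    }
    let len = u16::from_be_bytes([buf[2], buf[3]]) as usize;
    // Reject truncated frames; trailing padding bytes are ignored.
    let payload = buf.get(4..4 + len)?;
    Some(ChannelData { channel, payload })
}
```

Returning `None` for anything outside the channel range lets the caller fall through to `parse_message(...)` without copying the packet.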
### RFC interop note: FINGERPRINT
- All STUN messages built by the server append `FINGERPRINT` as the last attribute.
- If a client includes `FINGERPRINT`, it is validated; invalid fingerprints cause the message to be dropped (no response).
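As a rough illustration of both rules, the sketch below appends and validates `FINGERPRINT` the way RFC 5389 defines it: CRC-32 over the message up to the attribute, XORed with `0x5354554E`, with the header length field already counting the 8-byte attribute. The function names are hypothetical and the CRC is a plain bitwise version to keep the example dependency-free; the server's actual code in `src/stun.rs` may be organised differently.

```rust
/// Bitwise CRC-32 (IEEE 802.3, reflected polynomial 0xEDB88320).
fn crc32(data: &[u8]) -> u32 {
    let mut crc = 0xFFFF_FFFFu32;
    for &byte in data {
        crc ^= byte as u32;
        for _ in 0..8 {
            // mask is all ones when the low bit is set, else zero.
            let mask = (crc & 1).wrapping_neg();
            crc = (crc >> 1) ^ (0xEDB8_8320 & mask);
        }
    }
    !crc
}

const FINGERPRINT_XOR: u32 = 0x5354_554E;

/// Appends FINGERPRINT as the last attribute (hypothetical helper).
/// Returns false if `msg` is too short to hold a STUN header.
pub fn append_fingerprint(msg: &mut Vec<u8>) -> bool {
    if msg.len() < 20 {
        return false;
    }
    // The length field must include the attribute before the CRC is taken.
    let new_len = (msg.len() - 20 + 8) as u16;
    msg[2..4].copy_from_slice(&new_len.to_be_bytes());
    let value = crc32(msg) ^ FINGERPRINT_XOR;
    msg.extend_from_slice(&0x8028u16.to_be_bytes()); // attribute type
    msg.extend_from_slice(&4u16.to_be_bytes()); // attribute length
    msg.extend_from_slice(&value.to_be_bytes());
    true
}

/// Checks a trailing FINGERPRINT attribute (hypothetical helper).
pub fn fingerprint_is_valid(msg: &[u8]) -> bool {
    if msg.len() < 28 {
        return false;
    }
    let (body, attr) = msg.split_at(msg.len() - 8);
    if attr[0..4] != [0x80, 0x28, 0x00, 0x04] {
        return false;
    }
    attr[4..8] == (crc32(body) ^ FINGERPRINT_XOR).to_be_bytes()
}
```

A message that fails `fingerprint_is_valid` would simply be dropped, matching the "no response" behaviour described above.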
### Auth decisions and common error codes
Auth policy is centralised in `AuthManager::authenticate` in [src/auth.rs](../src/auth.rs).
- **401 Unauthorized**: when `MESSAGE-INTEGRITY` is missing → challenge with `REALM` + `NONCE`.
- **438 Stale Nonce**: when `NONCE` is expired/invalid → new challenge.
- **437 Allocation Mismatch**: when CreatePermission/Send/ChannelBind/Refresh arrives without an allocation.
- **403 Peer Not Permitted**: when a peer is not (or no longer) permitted.
- **400 Missing/Invalid ...**: when required attributes are missing or XOR-PEER-ADDRESS cannot be decoded.
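The mapping above can be summarised as a small decision function. This is only a sketch: `RequestFacts`, `TurnError` and `first_error` are hypothetical names, and the order in which the checks run is an assumption, not something taken from `AuthManager::authenticate`.

```rust
/// Error codes listed in this section (names follow this document).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum TurnError {
    BadRequest = 400,
    Unauthorized = 401,
    PeerNotPermitted = 403,
    AllocationMismatch = 437,
    StaleNonce = 438,
}

/// Hypothetical summary of what the server has learned about a
/// non-Allocate request (CreatePermission/Send/ChannelBind/Refresh).
pub struct RequestFacts {
    pub has_message_integrity: bool,
    pub nonce_is_fresh: bool,
    pub has_allocation: bool,
    pub required_attrs_ok: bool,
    pub peer_permitted: bool,
}

/// Returns the first applicable error, or None if the request may proceed.
/// The check order is an assumption made for this sketch.
pub fn first_error(f: &RequestFacts) -> Option<TurnError> {
    if !f.has_message_integrity {
        return Some(TurnError::Unauthorized); // challenge with REALM + NONCE
    }
    if !f.nonce_is_fresh {
        return Some(TurnError::StaleNonce); // new challenge
    }
    if !f.has_allocation {
        return Some(TurnError::AllocationMismatch);
    }
    if !f.required_attrs_ok {
        return Some(TurnError::BadRequest);
    }
    if !f.peer_permitted {
        return Some(TurnError::PeerNotPermitted);
    }
    None
}
```

The numeric code can be read back with `TurnError::StaleNonce as u16`, which is convenient when building the `ERROR-CODE` attribute.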
### Interop note: MESSAGE-INTEGRITY in responses
Some clients expect responses to be signed using the same “MESSAGE-INTEGRITY variant” that was accepted for the request.
`niom-turn` therefore derives the mode from the authenticated request and uses it for all responses within the same transaction.
---
## TLS (turns): sequence diagram (control plane)
Important: the TLS implementation in [src/tls.rs](../src/tls.rs) reuses the same TURN handler as TCP and implements a real **stream data plane**.
```mermaid
sequenceDiagram
autonumber
participant C as Client (turns)
participant T as niom-turn (TLS:5349)
participant R as Relay Socket (UDP:ephemeral)
participant P as Peer
Note over C,T: STUN framing over TCP/TLS: read → chunk by (len+20)
C->>T: STUN Allocate (no MI)
T->>C: 401 + REALM + NONCE (over TLS)
C->>T: STUN Allocate + Auth + MI
T->>R: allocate_for(peer, stream-sink)
T->>C: Allocate Success + XOR-RELAYED-ADDRESS + LIFETIME (over TLS)
C->>T: CreatePermission/Send/Refresh/ChannelBind (over TLS)
T->>C: Success/Error (over TLS)
Note over P,C: Peer data returns over the TLS stream
P->>R: UDP payload to relay_addr
R->>T: recv_from(Peer)
T->>C: Data Indication / ChannelData (over TLS)
```
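The "chunk by (len+20)" step in the diagram could look roughly like the function below. It is a minimal sketch with a hypothetical name, `drain_stun_frames`, and it handles STUN messages only: the 16-bit length at bytes 2..4 of the header counts the bytes after the fixed 20-byte header, so a full message is `20 + len` bytes. ChannelData frames on a stream use a different 4-byte header and are not covered here.

```rust
/// Removes every complete STUN message from the front of `buf` and
/// returns them in order; an incomplete tail stays in `buf`.
/// Hypothetical helper, not the function used in src/tls.rs.
pub fn drain_stun_frames(buf: &mut Vec<u8>) -> Vec<Vec<u8>> {
    let mut frames = Vec::new();
    loop {
        // Need the full 20-byte header before the length can be trusted.
        if buf.len() < 20 {
            break;
        }
        let body_len = u16::from_be_bytes([buf[2], buf[3]]) as usize;
        let total = 20 + body_len;
        // Wait for more bytes from the stream if the message is incomplete.
        if buf.len() < total {
            break;
        }
        frames.push(buf.drain(..total).collect());
    }
    frames
}
```

A reader loop would append each TLS read to `buf`, call this function, and pass every returned frame to the shared TURN handler.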
### Consequence
- Control plane over TLS works (Allocate/Refresh/… are answered over TLS).
- The **data-plane return path** (peer → client) also runs over the TLS stream (relay → `ClientSink::Stream`).
More details: [docs/tcp_tls_data_plane.md](tcp_tls_data_plane.md)
---
## Mini checklist: minimal flow (practical)
1. `ALLOCATE` without MI → `401` + `REALM` + `NONCE`
2. `ALLOCATE` with `USERNAME/REALM/NONCE` + `MESSAGE-INTEGRITY` → `XOR-RELAYED-ADDRESS` + `LIFETIME`
3. Optional `CREATE_PERMISSION` for the peer → `200`
4. `SEND` with `DATA` → server sends to the peer via relay
5. Peer sends back to the relay → server delivers to the client as `DATA-INDICATION` or `CHANNEL-DATA`
6. Optional `CHANNEL_BIND` + ChannelData for a more efficient data plane (may implicitly create the permission)
7. `REFRESH` to extend the allocation, or `REFRESH` with `LIFETIME=0` to release it
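When walking through this checklist by hand, the `XOR-RELAYED-ADDRESS` from step 2 and the `XOR-PEER-ADDRESS` used in steps 3 to 6 have to be decoded before they are readable. The sketch below shows the IPv4 case as defined in RFC 5389: the port is XORed with the top 16 bits of the magic cookie `0x2112A442`, the address with the whole cookie. The function names are hypothetical and IPv6 (which also mixes in the transaction ID) is left out.

```rust
use std::net::{Ipv4Addr, SocketAddrV4};

const MAGIC_COOKIE: u32 = 0x2112_A442;

/// Decodes the value of an XOR-*-ADDRESS attribute (IPv4 only).
/// Layout: reserved byte, family byte (0x01), X-Port, X-Address.
pub fn decode_xor_ipv4(value: &[u8]) -> Option<SocketAddrV4> {
    if value.len() != 8 || value[1] != 0x01 {
        return None;
    }
    let port = u16::from_be_bytes([value[2], value[3]]) ^ (MAGIC_COOKIE >> 16) as u16;
    let addr = u32::from_be_bytes([value[4], value[5], value[6], value[7]]) ^ MAGIC_COOKIE;
    Some(SocketAddrV4::new(Ipv4Addr::from(addr), port))
}

/// Inverse of `decode_xor_ipv4`, useful when building test requests.
pub fn encode_xor_ipv4(addr: SocketAddrV4) -> [u8; 8] {
    let xport = addr.port() ^ (MAGIC_COOKIE >> 16) as u16;
    let xaddr = u32::from(*addr.ip()) ^ MAGIC_COOKIE;
    let mut out = [0u8; 8];
    out[1] = 0x01; // address family: IPv4
    out[2..4].copy_from_slice(&xport.to_be_bytes());
    out[4..8].copy_from_slice(&xaddr.to_be_bytes());
    out
}
```

For example, the bytes `00 01 a1 47 e1 12 a6 43` (the IPv4 sample from RFC 5769) decode to `192.0.2.1:32853`.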