Step 4 introduces the most dangerous subsystem in embedded firmware: networking. Not because it is complex, but because it pretends to be helpful.
Part A – English Version
Why Step 4 Is Where Architectures Go to Die
Until now:
- sensor failures were local
- timing was predictable
- control flow was obvious
Networking changes everything:
- operations block for long, unpredictable periods
- partial success is common
- libraries hide retries
- callbacks tempt you to put logic in the wrong place
Most firmware architectures quietly collapse at this step.
The Lie Networking APIs Tell You
Most networking APIs are designed for applications, not devices.
They imply:
- the network is usually available
- failure is exceptional
- retrying is helpful
For an embedded device, all three assumptions are wrong.
Blocking Is Not the Enemy
Junior engineers often believe:
“Blocking calls are bad. We must be async.”
This is backwards.
Blocking calls are:
- honest
- explicit
- easy to reason about
Async calls hide:
- who owns control flow
- when retries happen
- where errors propagate
So in Step 4, we intentionally choose blocking networking.
What We Actually Need from Networking
From the application’s point of view, networking is simple:
- initialize network stack
- send one measurement
- possibly fail
That’s it.
Everything else is policy.
Network as a Subsystem
We treat networking the same way as the sensor:
- minimal API
- no retries
- no sleeping
- no decisions
The application remains in charge.
Network Context
struct net_ctx {
    int sock;
};
Nothing fancy.
Minimal Network API
int net_init(struct net_ctx *ctx);
int http_send_temp(struct net_ctx *ctx, int32_t temp_mdeg);
Return values:
- 0 on success
- negative errno on failure
No callbacks.
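The body of net_init is not shown in this step. As a rough sketch of what it could look like under the same rules (blocking, no retries, negative errno on failure), the version below opens one TCP connection using POSIX socket calls. The helper net_connect_ipv4, the server address 192.0.2.10, and port 8080 are assumptions made for illustration; on Zephyr the socket declarations would come from <zephyr/net/socket.h> rather than the POSIX headers used here.

```c
#include <arpa/inet.h>
#include <errno.h>
#include <netinet/in.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

struct net_ctx {
    int sock;
};

/* Hypothetical helper: open one blocking TCP connection to an IPv4 literal.
 * No retries, no sleeping: the first failure is reported to the caller. */
int net_connect_ipv4(struct net_ctx *ctx, const char *ip, uint16_t port)
{
    struct sockaddr_in addr;

    ctx->sock = -1;
    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_port = htons(port);

    /* inet_pton() returns 1 on success, 0 if the string is not a valid address. */
    if (inet_pton(AF_INET, ip, &addr.sin_addr) != 1) {
        return -EINVAL;
    }

    int sock = socket(AF_INET, SOCK_STREAM, IPPROTO_TCP);
    if (sock < 0) {
        return -errno;
    }

    if (connect(sock, (struct sockaddr *)&addr, sizeof(addr)) < 0) {
        int err = errno; /* save before close() can overwrite it */
        close(sock);
        return -err;
    }

    ctx->sock = sock;
    return 0;
}

/* Assumed server address and port, for illustration only. */
int net_init(struct net_ctx *ctx)
{
    return net_connect_ipv4(ctx, "192.0.2.10", 8080);
}
```

On any failure the context is left with sock set to -1, so a later send on it fails immediately instead of writing to a stale descriptor.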
Blocking HTTP Example (Zephyr)
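#include <errno.h>
#include <zephyr/net/socket.h>
#include <zephyr/sys/printk.h>

int http_send_temp(struct net_ctx *ctx, int32_t temp_mdeg)
{
    char payload[64];
    /* Body only: the HTTP request line and headers are omitted here. */
    int len = snprintk(payload, sizeof(payload),
                       "{\"temp\": %d}", (int)temp_mdeg);
    if (len < 0 || len >= (int)sizeof(payload)) {
        return -ENOMEM; /* payload would have been truncated */
    }
    /* Blocks until the stack accepts the data or reports an error. */
    ssize_t ret = send(ctx->sock, payload, len, 0);
    if (ret < 0) {
        return -errno;
    }
    /* A short write counts as failure; the application decides what next. */
    return (ret == len) ? 0 : -EIO;
}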
This function:
- blocks
- either succeeds or fails
- tells the truth
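The example above sends only the JSON body. A real HTTP server also expects a request line and headers in front of it. One possible way to build the full request is sketched below in standard C; the helper name http_format_post and the host and path passed to it are hypothetical, and the header set is a minimal assumption rather than what any particular server requires.

```c
#include <errno.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: format a complete HTTP/1.1 POST request into buf.
 * Returns the request length, or -ENOMEM if buf is too small. */
int http_format_post(char *buf, size_t size, const char *host,
                     const char *path, const char *body)
{
    int len = snprintf(buf, size,
                       "POST %s HTTP/1.1\r\n"
                       "Host: %s\r\n"
                       "Content-Type: application/json\r\n"
                       "Content-Length: %zu\r\n"
                       "Connection: close\r\n"
                       "\r\n"
                       "%s",
                       path, host, strlen(body), body);

    /* snprintf() returns the length it wanted to write; reject truncation. */
    if (len < 0 || (size_t)len >= size) {
        return -ENOMEM;
    }
    return len;
}
```

The formatted buffer would then go through the same single blocking send call, so the function still either succeeds or fails as a whole.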
Integrating NET_INIT into the State Machine
case APP_STATE_NET_INIT:
    ret = net_init(&ctx->net);
    if (ret < 0) {
        ctx->last_error = ret;
        ctx->state = APP_STATE_ERROR;
        break;
    }
    ctx->state = APP_STATE_IDLE;
    break;
The network layer does not retry on its own.
Integrating SEND
case APP_STATE_SEND:
    ret = http_send_temp(&ctx->net, ctx->last_temp_mdeg);
    if (ret < 0) {
        ctx->last_error = ret;
        ctx->recovery_state = APP_STATE_NET_INIT;
        ctx->state = APP_STATE_ERROR;
        break;
    }
    ctx->state = APP_STATE_WAIT;
    break;
Notice something important:
- SEND does not retry
- SEND does not sleep
- SEND does not reconnect
It reports. The application decides.
The Introduction of recovery_state
This is a critical addition:
enum app_state recovery_state;
Why?
Because after failure, we need to answer:
“Where do we resume once the failure is handled?”
This is not the same as:
“What failed?”
Separating these concepts keeps logic clean.
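To make the separation concrete, here is a small sketch of how the two answers might sit side by side in the application context. The enum lists only the states named in this step, and the helper app_fail is hypothetical: it is one way to make sure every failing state records both values before handing over to ERROR.

```c
#include <errno.h>

enum app_state {
    APP_STATE_NET_INIT,
    APP_STATE_IDLE,
    APP_STATE_SEND,
    APP_STATE_WAIT,
    APP_STATE_ERROR,
};

struct app_ctx {
    enum app_state state;          /* what the state machine runs next */
    enum app_state recovery_state; /* where to resume once the failure is handled */
    int last_error;                /* what failed (negative errno) */
};

/* Hypothetical helper: record both answers in one place, then enter ERROR. */
static void app_fail(struct app_ctx *ctx, int err, enum app_state resume_at)
{
    ctx->last_error = err;
    ctx->recovery_state = resume_at;
    ctx->state = APP_STATE_ERROR;
}
```

With such a helper, the failure branch of SEND would reduce to a single call like app_fail(ctx, ret, APP_STATE_NET_INIT).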
Why We Do NOT Jump Directly to NET_INIT
Tempting but wrong:
ctx->state = APP_STATE_NET_INIT;
This bypasses:
- centralized error handling
- retry policy
- backoff
- logging
With recovery_state, every failure passes through ERROR and WAIT before NET_INIT runs again.
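The ERROR and WAIT handlers are not shown in this step. The sketch below is one possible shape for them, assuming an exponential backoff policy. The fields fail_count and wait_ms, the two BACKOFF constants, and platform_sleep_ms are all assumptions made for illustration; on Zephyr the sleep stand-in would typically be a call such as k_msleep. How WAIT serves the normal measurement period is outside this sketch.

```c
#include <stdint.h>
#include <stdio.h>

enum app_state { APP_STATE_NET_INIT, APP_STATE_IDLE, APP_STATE_SEND,
                 APP_STATE_WAIT, APP_STATE_ERROR };

struct app_ctx {
    enum app_state state;
    enum app_state recovery_state;
    int last_error;
    uint32_t fail_count; /* assumed field: consecutive failures so far */
    uint32_t wait_ms;    /* assumed field: how long WAIT should sleep */
};

#define BACKOFF_BASE_MS 1000u  /* assumed policy values */
#define BACKOFF_MAX_MS  60000u

static uint32_t slept_ms;      /* host stand-in for the platform sleep call */
static void platform_sleep_ms(uint32_t ms) { slept_ms += ms; }

/* ERROR: the single place for logging, retry policy and backoff. */
static void app_handle_error(struct app_ctx *ctx)
{
    uint32_t delay = BACKOFF_BASE_MS;

    /* Double the delay per consecutive failure, capped at BACKOFF_MAX_MS. */
    for (uint32_t i = 0; i < ctx->fail_count && delay < BACKOFF_MAX_MS; i++) {
        delay *= 2;
    }
    if (delay > BACKOFF_MAX_MS) {
        delay = BACKOFF_MAX_MS;
    }

    printf("error %d, resuming at state %d in %u ms\n",
           ctx->last_error, (int)ctx->recovery_state, (unsigned)delay);

    ctx->fail_count++;
    ctx->wait_ms = delay;
    ctx->state = APP_STATE_WAIT;
}

/* WAIT: sleep, then resume where the failing state asked to resume. */
static void app_handle_wait(struct app_ctx *ctx)
{
    platform_sleep_ms(ctx->wait_ms);
    ctx->state = ctx->recovery_state;
}
```

In this arrangement a successful SEND would be the natural place to reset fail_count to zero, so the delay only grows while failures are consecutive.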
What Step 4 Deliberately Avoids
Step 4 does NOT:
- use callbacks
- retry inside networking
- manage timeouts dynamically
- optimize throughput
Those come later, if needed.
A Reviewer’s Perspective
A reviewer can now clearly see:
- where network failures occur
- how they propagate
- who decides recovery
No hidden magic.
Final Thought (English)
Networking should tell you the truth, even when that truth is uncomfortable.
Blocking APIs help you keep control.
Part B – Vietnamese Version (Translated)
Why Step 4 Is Where Architectures Tend to Collapse
Networking makes everything more complicated:
- calls block for a long time
- errors are unclear
- libraries “help” too much
A lot of firmware dies here.
The Lie Networking APIs Tell
Networking APIs usually assume:
- the network is stable
- errors are rare
For a device, the opposite is true.
Blocking Is Not Bad
Blocking is:
- explicit
- easy to reason about
Async often:
- hides logic
- hides ownership
So Step 4 starts with blocking.
Network as a Subsystem
- minimal API
- no retries
- no sleeping
The application decides everything.
Integrating into the State Machine
SEND only:
- makes the call
- receives the result
- reports the error
recovery_state Is the Key
It answers:
“After the error is handled, where do we go back to?”
Not:
“What error occurred?”
Why Not Jump Straight to NET_INIT
Jumping straight there breaks:
- retry policy
- backoff
- centralized logging
ERROR + WAIT is mandatory.
What Step 4 Does NOT Do
- no callbacks
- no optimization
- no async
It is deliberately kept simple.
Final Thought (Vietnamese)
A good network layer is one that tells the truth, even when that truth is uncomfortable.
Blocking helps you keep control.