Server builds, wallet signs, webhooks confirm

A Web3 checkout flow lets a user complete a purchase by sending funds on-chain—no cards, banks, or intermediaries. When designed correctly, it’s non-custodial: your server never holds private keys. Instead, it controls the amount, recipient, and audit trail without touching user funds. This post shows how to build that on Solana, with patterns applicable to any chain.

[Figure: Web3 checkout system architecture]

Why build the transaction server-side?

A client-side payment flow has no server-controlled state before the transaction is submitted. The amount, the recipient, the intent — all of it lives in the browser. Your server is a passive observer, reconciling after the fact.

This creates three concrete security problems.

No pre-authorisation. There is no server-verified record of what should be paid before the wallet signs. A compromised client can construct a transaction for any amount to any address.

No fraud detection surface. Without a pending record created server-side before submission, you have nothing to validate the on-chain result against. You learn what happened, not what was supposed to happen.

No audit trail. On-chain data tells you a transaction occurred. It does not tell you which user initiated it, which checkout session it belongs to, or whether the amount matches what was agreed.

Building the transaction server-side significantly reduces these gaps. Your server derives the recipient address from your database — not from the client. It sets the amount. It embeds a unique identifier in the transaction itself (for example, in instruction data or a memo field) that permanently links the on-chain event to a record in your database. It creates a pending entry before the wallet is ever involved.

The client receives an unsigned transaction. The wallet signs it. The private key never leaves the browser. The non-custodial guarantee holds. But now your server controls the terms of every payment before it happens.

Threat model

This pattern assumes the client cannot be trusted and external inputs may be malicious or unreliable. In particular, we design against:

A compromised or modified client. The browser can construct arbitrary transactions, alter amounts, or attempt to redirect funds.
Cross-site request forgery (CSRF). A third-party site may attempt to trigger authenticated requests to build transactions without user intent.
Concurrent or replayed requests. Multiple requests using the same session or token may arrive simultaneously or be retried.
Untrusted webhook delivery. Webhook payloads may be spoofed, replayed, or delivered multiple times. Each incoming request should be authenticated by verifying the Authorization header against a pre-shared secret, and handled idempotently to guard against duplicate delivery.
Missing or delayed client signals. The client may fail to report the transaction signature, or report it out of order relative to on-chain events.

The system is designed so that no single client action, webhook event, or race condition can cause funds to be redirected or records to be incorrectly confirmed.

The checkout session

A checkout session is a short-lived, server-controlled record of payment intent. Before the client builds anything or the wallet is involved, your server creates a session that captures the agreed terms: who is paying, who is receiving, how much, and in which currency.

This is the foundation of the server-side approach. The session is the source of truth. When it comes time to build the transaction, the server reads from the session — not from the client.

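const session = {
  id: receiverId, // who is receiving
  amount: parsedAmount, // amount parsed and validated on the server
};
// Store under the session ID; the store enforces the ten-minute TTL.
await kv.set(["checkout", sessionId], session, { expireIn: TEN_MINUTES_MS });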

The session is stored server-side with a short TTL — ten minutes in this implementation. Any key-value store works here: Deno KV, Redis, or a database row with an expiry timestamp. The important thing is that the session lives on the server, not in the client.
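For the database-row variant, expiry has to be checked when the session is read. Here is a minimal sketch, assuming an in-memory Map stands in for the table. The names CheckoutSession, createSession, and readSession are hypothetical, and the payer, receiver, and currency fields are illustrative rather than taken from the implementation above.

```typescript
// Hypothetical session shape; field names are illustrative.
interface CheckoutSession {
  payerId: string;
  receiverId: string;
  amount: number; // integer amount in the currency's smallest unit
  currency: string;
  expiresAt: number; // epoch milliseconds
}

const TEN_MINUTES_MS = 10 * 60 * 1000;

// In-memory stand-in for a database table keyed by session ID.
const sessions = new Map<string, CheckoutSession>();

function createSession(
  sessionId: string,
  terms: Omit<CheckoutSession, "expiresAt">,
  now: number,
): CheckoutSession {
  const session: CheckoutSession = { ...terms, expiresAt: now + TEN_MINUTES_MS };
  sessions.set(sessionId, session);
  return session;
}

// Returns the session only while it is still live; expired rows are deleted on read.
function readSession(sessionId: string, now: number): CheckoutSession | undefined {
  const session = sessions.get(sessionId);
  if (session === undefined) return undefined;
  if (now >= session.expiresAt) {
    sessions.delete(sessionId);
    return undefined;
  }
  return session;
}
```

Passing the current time in as a parameter keeps the expiry rule easy to test. With a store that supports native TTLs, the expiry check disappears from application code entirely.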

CSRF protection

Cross-Site Request Forgery (CSRF) is an attack where a malicious page triggers an authenticated request to your server on behalf of a user — without their knowledge. In a payment context, this is particularly dangerous. A CSRF attack against a checkout flow could trick your server into building a transaction the user never intended to sign.

The standard defence is a CSRF token: a secret tied to the session that must be present on subsequent requests. Because the token is only accessible to the legitimate client that created the session, a malicious third-party page cannot forge the request.
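As a rough sketch of the token itself, the server can issue a random value when the session is created and compare it in constant time on later requests. This assumes a runtime that provides Node's crypto module; issueCsrfToken and csrfTokenMatches are hypothetical helper names, not part of any framework.

```typescript
import { Buffer } from "node:buffer";
import { randomBytes, timingSafeEqual } from "node:crypto";

// Hypothetical helper: issue a random token to store on the session
// and return to the legitimate client.
function issueCsrfToken(): string {
  return randomBytes(32).toString("hex"); // 64 hex characters
}

// Compare the presented token with the stored one in constant time.
function csrfTokenMatches(stored: string, presented: string): boolean {
  const a = Buffer.from(stored, "utf8");
  const b = Buffer.from(presented, "utf8");
  // timingSafeEqual throws on a length mismatch, so reject unequal lengths first.
  if (a.length !== b.length) return false;
  return timingSafeEqual(a, b);
}
```

The constant-time comparison is a precaution against leaking how many leading characters of a guess were correct.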

TOCTOU — the race condition problem

A standard CSRF token has a subtle weakness in high-concurrency environments. The check and the invalidation are two separate operations. If two requests arrive simultaneously, both carrying a valid token, both can pass validation before either invalidates it — a Time-of-Check to Time-of-Use (TOCTOU) race condition. For a payment flow, that means a single checkout session could be used to build multiple transactions.

The fix is atomic invalidation. Rather than checking and deleting the token in two separate operations, you do both in a single atomic transaction:

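// sessionEntry is the versioned entry returned when the session was read.
const commit = await kv.atomic()
  .check(sessionEntry) // fail if the session changed since it was read
  .set(["checkout", sessionId], updatedSession, { expireIn: TEN_MINUTES_MS })
  .commit();

if (!commit.ok) {
  // Another request modified the session first, so this one is rejected.
  return c.json({ message: "Conflict detected, please try again" }, 409);
}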

The .check() asserts that the session has not been modified since it was read. The .set() writes the updated session, in which the token has been invalidated. If another request races in and modifies the session between your read and your write, .commit() fails and the second request is rejected.
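To see why a version check closes the race, here is a small in-memory stand-in for optimistic check-and-set. It is not Deno KV: VersionedStore and its checkAndSet method are hypothetical, written only to show two requests reading the same snapshot and only one of them winning.

```typescript
interface Entry<T> {
  value: T;
  version: number;
}

// Hypothetical in-memory store with optimistic concurrency control.
class VersionedStore<T> {
  private entries = new Map<string, Entry<T>>();

  get(key: string): Entry<T> | undefined {
    return this.entries.get(key);
  }

  set(key: string, value: T): void {
    const current = this.entries.get(key);
    this.entries.set(key, { value, version: (current?.version ?? 0) + 1 });
  }

  // Writes only if the entry still has the version the caller read.
  checkAndSet(key: string, expectedVersion: number, value: T): boolean {
    const current = this.entries.get(key);
    if (current === undefined || current.version !== expectedVersion) return false;
    this.entries.set(key, { value, version: expectedVersion + 1 });
    return true;
  }
}

interface Session {
  csrfToken: string | null;
  amount: number;
}

const store = new VersionedStore<Session>();
store.set("session-1", { csrfToken: "token-abc", amount: 1000 });

// Two requests read the same snapshot before either one writes.
const readA = store.get("session-1")!;
const readB = store.get("session-1")!;

// Each tries to consume the token based on the version it read.
const okA = store.checkAndSet("session-1", readA.version, { ...readA.value, csrfToken: null });
const okB = store.checkAndSet("session-1", readB.version, { ...readB.value, csrfToken: null });

console.log(okA, okB); // true false
```

The second write fails because the first one bumped the version, which is the same outcome a failed .commit() signals.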

One session. One transaction. No exceptions.

Building the transaction server-side

The following implementation uses Solana and @solana/kit — the modern, functional TypeScript SDK for Solana. If you're not familiar with it, Anza's documentation covers creating instructions, building a transaction, and sending a transaction.

Once the checkout session exists, the server has everything it needs. The client has no input into this step beyond presenting a valid session ID and CSRF token.

The server builds a transaction using @solana/kit — setting the fee payer, the blockhash lifetime, the compute budget, and the transfer instruction. Three things are worth understanding about how this is structured:

The recipient address is derived from the database — not from the client. The server looks up the receiver's wallet address from the session. The client cannot influence who receives the funds.
The compute budget is set explicitly rather than estimated. In a payment context, predictable fees matter — you do not want a transaction to fail or cost an unexpected amount at signing time.
A unique identifier is embedded in the transaction. This identifier maps directly to the pending database record created before submission — permanently linking the on-chain event back to your system, even if your confirmation webhook fails.

The server compiles the message and returns an unsigned transaction. It has constructed every detail, but it cannot submit it. Authorisation requires the sender's private key, which never leaves their browser.
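The three points above can be sketched as plain data, before any SDK is involved. This is not @solana/kit code: buildPaymentPlan, the receivers table, the placeholder address, and the compute unit limit are all assumptions made for illustration. A real implementation would turn the resulting plan into instructions with the SDK.

```typescript
interface CheckoutSession {
  receiverId: string;
  amount: number; // integer amount in the smallest unit
}

interface PaymentPlan {
  recipient: string; // wallet address looked up server-side
  amount: number; // taken from the session, never from the request body
  computeUnitLimit: number; // fixed rather than estimated
  reference: string; // identifier to embed in a memo or instruction data
}

// Hypothetical receiver table; the address is a placeholder, not a real wallet.
const receivers = new Map<string, string>([
  ["merchant-1", "RECEIVER_WALLET_ADDRESS_PLACEHOLDER"],
]);

const COMPUTE_UNIT_LIMIT = 50000; // illustrative value only

// Note the signature: there is no parameter through which a client
// could supply a recipient or an amount.
function buildPaymentPlan(session: CheckoutSession, reference: string): PaymentPlan {
  const recipient = receivers.get(session.receiverId);
  if (recipient === undefined) {
    throw new Error(`Unknown receiver: ${session.receiverId}`);
  }
  if (!Number.isSafeInteger(session.amount) || session.amount <= 0) {
    throw new Error("Session amount must be a positive integer");
  }
  return {
    recipient,
    amount: session.amount,
    computeUnitLimit: COMPUTE_UNIT_LIMIT,
    reference,
  };
}
```

Keeping this step free of client input is the design choice that matters; the SDK calls that follow are mechanical.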

The payment intent record

Before the unsigned transaction leaves the server, a pending record is written to the database. This is not a nice-to-have — it is the foundation of your audit trail. The concept is borrowed from traditional payment systems — Stripe calls it a PaymentIntent: a server-side record that tracks the lifecycle of a payment from creation through confirmation.

The record captures the agreed terms before the user acts. If the wallet signs and submits but your confirmation logic fails, you still have a record of the intent — tied to the session, the amount, and the identifier embedded in the transaction.
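A pending record can be as small as the sketch below. The field names, the in-memory Map, and createPendingIntent are hypothetical; the only parts taken from the text are the pending and CONFIRMED states and the link to the session, the amount, and the embedded identifier.

```typescript
type IntentStatus = "PENDING" | "CONFIRMED";

interface PaymentIntent {
  reference: string; // identifier embedded in the transaction
  sessionId: string;
  recipient: string;
  amount: number;
  status: IntentStatus;
  createdAt: number; // epoch milliseconds
}

// In-memory stand-in for the database table, keyed by reference.
const intents = new Map<string, PaymentIntent>();

function createPendingIntent(
  terms: { reference: string; sessionId: string; recipient: string; amount: number },
  now: number,
): PaymentIntent {
  // A reference must map to exactly one intent, so duplicates are refused.
  if (intents.has(terms.reference)) {
    throw new Error(`Duplicate reference: ${terms.reference}`);
  }
  const intent: PaymentIntent = { ...terms, status: "PENDING", createdAt: now };
  intents.set(terms.reference, intent);
  return intent;
}
```

Writing this row before the unsigned transaction is returned means every transaction a wallet could sign already has a record waiting for it.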

Confirming the record

Once the transaction is submitted by the client, your server needs to know when it lands on-chain. Polling RPC endpoints for transaction status is inefficient and difficult to scale reliably. In practice, this is typically handled using webhooks.

A webhook provider monitors the chain and delivers a POST request to your server when a transaction involving your program is confirmed. You configure the subscription with your program address, and the provider notifies your backend as events occur.

That said, webhook delivery is just another external input — it must be treated as untrusted. When configuring your webhook provider, set a shared secret and verify it on every incoming request — most providers send it as an Authorization header. Your server should reject any request where the header is missing or does not match. Deliveries should also be handled idempotently, as duplicate or delayed events are common.
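The two checks, authentication and duplicate handling, might look like the sketch below. It assumes the provider sends the raw shared secret as the Authorization header value and that each delivery can be keyed by something stable such as the transaction signature. Both are assumptions to verify against your provider, and admitWebhook is a hypothetical name.

```typescript
import { createHash, timingSafeEqual } from "node:crypto";

// Hash both values so the constant-time comparison always sees equal lengths.
function secretsMatch(expected: string, received: string): boolean {
  const a = createHash("sha256").update(expected).digest();
  const b = createHash("sha256").update(received).digest();
  return timingSafeEqual(a, b);
}

type WebhookDecision = "rejected" | "duplicate" | "accepted";

// In-memory stand-in for a durable record of processed deliveries.
const seenDeliveries = new Set<string>();

function admitWebhook(
  authorization: string | undefined,
  deliveryKey: string,
  secret: string,
): WebhookDecision {
  // Reject a missing or mismatched header before looking at the payload.
  if (authorization === undefined || !secretsMatch(secret, authorization)) {
    return "rejected";
  }
  // A repeated delivery is acknowledged but not processed again.
  if (seenDeliveries.has(deliveryKey)) return "duplicate";
  seenDeliveries.add(deliveryKey);
  return "accepted";
}
```

In production the set of seen deliveries would live in the database, so that duplicates are still caught after a restart.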

When a webhook is received, your server reads the identifier embedded in the transaction. Using that identifier, the server:

looks up the pending record
verifies the on-chain recipient and amount match the expected values
updates the record status to CONFIRMED

This identifier is the critical link between your database and the blockchain. Without it, you are reduced to matching transactions by amount and timing — which is fragile under load and difficult to secure.
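The confirmation steps can be sketched as a single function. The record and event shapes, the outcome names, and confirmFromWebhook are hypothetical; how the identifier, recipient, and amount are extracted from a webhook payload depends on the provider and is left out.

```typescript
type Status = "PENDING" | "CONFIRMED";

interface PendingRecord {
  reference: string;
  recipient: string;
  amount: number;
  status: Status;
}

// Values parsed from the webhook payload (parsing not shown).
interface OnChainEvent {
  reference: string;
  recipient: string;
  amount: number;
}

type Outcome = "confirmed" | "already_confirmed" | "not_found" | "mismatch";

function confirmFromWebhook(
  records: Map<string, PendingRecord>,
  event: OnChainEvent,
): Outcome {
  // 1. Look up the pending record by the on-chain identifier.
  const record = records.get(event.reference);
  if (record === undefined) return "not_found";

  // Duplicate deliveries are harmless: a confirmed record stays confirmed.
  if (record.status === "CONFIRMED") return "already_confirmed";

  // 2. Verify the on-chain recipient and amount against the agreed terms.
  if (record.recipient !== event.recipient || record.amount !== event.amount) {
    return "mismatch";
  }

  // 3. Only now update the status.
  record.status = "CONFIRMED";
  return "confirmed";
}
```

A mismatch leaves the record pending so it can be flagged for review rather than silently confirmed.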

It is also important that confirmation does not depend on the client. The client may fail to report the transaction signature, report it late, or not report it at all. The webhook becomes the source of truth, while client-reported data is treated as a hint or optimization, not a requirement.

[Figure: Web3 checkout flow, pending record lifecycle]

What the client does

The client's role in this flow is intentionally minimal.

It receives the unsigned transaction from the server and passes it to the user's wallet — Phantom, Backpack, or any compatible Solana wallet — and waits for the user to approve it.

The wallet decodes the transaction, displays the details to the user, and prompts for approval. If the user confirms, the wallet signs it with their private key and submits it directly to the Solana network. The signed transaction never passes through your server.

Once submitted, the client sends the transaction signature back to your server. The server treats it as a hint that helps match the on-chain event to the pending record when the webhook fires; confirmation itself does not depend on it.

The private key is used once, locally by the sender, and never transmitted. That is the non-custodial guarantee in practice.

Security considerations

The security of this pattern rests on several compounding layers.

Session TTL. Checkout sessions are short-lived, and expiry is enforced at the key-value store level rather than in application logic. An expired session cannot be used to build a transaction.

CSRF protection. A one-time token is tied to each session and consumed atomically on use. Concurrent requests cannot both succeed. A malicious third-party page cannot forge the request.

TOCTOU prevention. The atomic .check().set().commit() pattern ensures the token can only be used once, even under concurrent load.

Server-controlled recipient. The receiver's wallet address is derived from the database at transaction build time — not supplied by the client. A compromised client cannot redirect funds.

Explicit compute budget. Transaction fees are set server-side and predictable. There are no surprises at signing time.

Amount validation. The on-chain amount is verified against the pending record on confirmation. Mismatches are flagged immediately.

Rate limiting. Session creation and transaction build endpoints are rate limited independently, reducing the surface for abuse.

Trust assumption. This pattern assumes a non-malicious server. The wallet is the user's last line of defence — compliant wallets like Phantom decode and display all transaction instructions before signing, surfacing any unexpected calls. A commit-reveal scheme would add a cryptographic guarantee but shifts complexity to the client-side code. For most production deployments, the wallet inspection layer is sufficient.
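The rate limiting layer listed above can be as simple as a fixed-window counter per endpoint. The sketch below is one possible approach, not the limiter used in this implementation: FixedWindowLimiter, the limits, and the window length are all illustrative assumptions.

```typescript
// Hypothetical fixed-window limiter: at most `limit` calls per key per window.
class FixedWindowLimiter {
  private readonly limit: number;
  private readonly windowMs: number;
  private readonly windows = new Map<string, { start: number; count: number }>();

  constructor(limit: number, windowMs: number) {
    this.limit = limit;
    this.windowMs = windowMs;
  }

  allow(key: string, now: number): boolean {
    const current = this.windows.get(key);
    // Start a fresh window on first use or once the old window has elapsed.
    if (current === undefined || now - current.start >= this.windowMs) {
      this.windows.set(key, { start: now, count: 1 });
      return true;
    }
    if (current.count >= this.limit) return false;
    current.count += 1;
    return true;
  }
}

// Separate limiters, so exhausting one endpoint's budget does not affect the other.
const sessionLimiter = new FixedWindowLimiter(5, 60000); // session creation
const buildLimiter = new FixedWindowLimiter(2, 60000); // transaction build
```

Keying the limiter by user or IP address and keeping one instance per endpoint is what makes the limits independent.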

Together these controls apply a pattern familiar from traditional payment systems — the Payment Intent — to a non-custodial Web3 context. The server defines the terms of each payment and the client authorises. The blockchain settles. Webhooks reconcile.

None of this requires custody of user funds or private keys. The security comes from server-controlled state — not server-controlled keys.
