Categories
Cryptography Security Stupid TLS Facts

Stupid TLS Facts: TLS Resumption

DID YOU KNOW that TLS doesn’t actually have perfect forward secrecy in most circumstances?

Rather, TLS has “conditional future imperfect secrecy” because it would have had perfect forward secrecy without resumption.


“TLS Session Resumption” is the TLDR of the myth of forward secrecy in TLS.

SSL handshakes are expensive, requiring multiple round trips for key exchange and expensive computations. Enter TLS resumption. With TLS resumption, a session ticket enables both client and server to jump back into their shared session with a single round trip (plus the SYN/ACK round trip) — and the resumption key derivation is a tenth of the cost of the full key exchange computation (per Cloudflare’s resumption performance data).

However, that optimization comes at the cost of forward secrecy: compromising the stored session ticket compromises any past sessions (as well as future sessions, which is already assumed in the TLS threat model, since theft of TLS private key material will compromise any future sessions using that key).

Forward secrecy (or “perfect forward secrecy”, PFS) is the assurance that compromising a system at time T won’t cause a compromise of any sessions/messages before time T.

Once a system is compromised, forward secrecy does not provide assurances about future sessions, which are assumed to be compromised until secrets & keys are rotated (and the attacker is removed, etc.).

How does TLS provide forward secrecy when resumption isn’t enabled? Through key exchange! The foundation of confidentiality in TLS is through a sequence of cryptographic operations in which the client and server share public values that they each combine with their private values to produce their shared secret (which is cryptographically infeasible for observers to derive from only the public values).
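To make the key exchange idea concrete, here is a minimal sketch of an ephemeral Diffie-Hellman exchange. The prime, the generator, and the function names are assumptions chosen for illustration; real TLS uses standardized groups and curves, not a 127-bit toy modulus.

```python
import secrets

# Toy parameters for illustration only (assumed, not what TLS uses).
P = 2**127 - 1   # a Mersenne prime, far too small for real use
G = 3

def ephemeral_keypair():
    """Return a fresh (private, public) pair meant to be discarded after the session."""
    private = secrets.randbelow(P - 3) + 2   # uniform in 2..P-2
    public = pow(G, private, P)
    return private, public

def shared_secret(my_private, their_public):
    """Combine my private value with the peer's public value."""
    return pow(their_public, my_private, P)

# Each side generates its own pair; only the public halves cross the wire.
client_priv, client_pub = ephemeral_keypair()
server_priv, server_pub = ephemeral_keypair()

client_view = shared_secret(client_priv, server_pub)
server_view = shared_secret(server_priv, client_pub)
assert client_view == server_view
```

Forward secrecy comes from throwing `client_priv` and `server_priv` away once the session ends: a later compromise finds nothing that reproduces the shared secret.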

How does resumption violate forward secrecy? If the attacker, Eve, compromises the server after a TLS session, they would have the resumption ticket and the TLS private key. The TLS private key enables Eve to impersonate the server. The resumption ticket enables Eve to breach the confidentiality of past recorded TLS sessions. (Notwithstanding the caveats in the last section.)

TLS Resumption: Stateful & Stateless

TLS resumption comes in 2 flavors, stateful and stateless. The abbreviated handshake for TLS resumption is an extension to the typical ClientHello & ServerHello, enabling the abbreviated handshake to fall back to the full handshake if either side doesn’t have the resumption data.

In stateful resumption, the server designates a session ID, and both the client and the server store that ID with the shared secret derived in key exchange. The client can then begin by including the session ID along with the usual ClientHello fields (cipher suite, random number, etc). The ServerHello response also includes the session ID with the usual ServerHello fields (cipher suite, random number, etc). Each side combines those random numbers with the shared secret corresponding to the session ID to derive a new shared session secret.
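The derivation in the abbreviated handshake can be sketched as follows. This is a simplified stand-in for the TLS 1.2 PRF (a single HMAC instead of the full iterated construction), and the function name is an assumption for illustration.

```python
import hashlib
import hmac
import os

def derive_session_keys(master_secret: bytes, client_random: bytes,
                        server_random: bytes) -> bytes:
    """Simplified stand-in for the TLS PRF: mix the cached secret with both randoms."""
    label = b"key expansion"
    return hmac.new(master_secret, label + server_random + client_random,
                    hashlib.sha256).digest()

# Cached by client and server under the session ID after the full handshake.
master_secret = os.urandom(48)

# The randoms travel in the clear in ClientHello / ServerHello on every resumption.
client_random, server_random = os.urandom(32), os.urandom(32)
keys_resumed = derive_session_keys(master_secret, client_random, server_random)

# Anyone who later obtains the cached master secret and has recorded the
# hello messages can recompute exactly the same keys.
keys_attacker = derive_session_keys(master_secret, client_random, server_random)
assert keys_attacker == keys_resumed
```

Note that the only secret input is the cached master secret. The fresh randoms make each resumed session's keys different, but they add nothing an eavesdropper doesn't already have.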

Stateful resumption requires a connection with the same single “server” holding onto those session details, which are usually cached in memory for up to 24 hours.

The plaintext session information or session ticket IS the shared symmetric encryption key (often called the “master secret”) used to encrypt the first TLS session between a client and a server.

Would it be wise to store a different key so that past sessions are protected by forward secrecy? Yes.

Is that how session resumption was bolted onto SSL? Nope!

In stateless resumption, the server stores session details (protocol version, cipher suite, shared secret, and client identity) in an encrypted session ticket, and this session ticket is stored by the client. The client can resume the session by sending the session ticket in the ClientHello.

The session ticket is encrypted by the Session Ticket Encryption Key (STEK), which is only known to the server, and is typically rotated every 1-24 hours (but can be arbitrarily longer, per RFC 5077). Thus, the server only needs to hold on to the STEK, and the client holds onto the session ticket along with the shared secret.

See examples of session resumption packets.

Compromises Are Made

The false promise of TLS’s claim to forward secrecy was that a server compromise would not compromise the confidentiality of previous TLS sessions. However, with TLS resumption, the forward secrecy of past sessions is bound to the lifetime of the session information or Session Ticket Encryption Keys held on the server.

Compromising a server holding session information hands the shared session keys directly to the adversary, Eve. In the case of session tickets, if Eve gets hold of the STEK, they can decrypt the encrypted session ticket to get the same session information. The session information encrypted by the STEK is directly the previous session’s shared symmetric encryption key, and thus Eve can decrypt the encrypted messages of the TLS session.
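Here is a toy model of the session ticket case. The keystream construction (HMAC-SHA256 in counter mode) is only a stand-in for a real cipher, integrity protection is left out, and all function names are assumptions for illustration.

```python
import hashlib
import hmac
import os

def keystream(key: bytes, nonce: bytes, length: int) -> bytes:
    """Toy keystream (HMAC-SHA256 in counter mode); a stand-in for a real cipher."""
    out = b""
    counter = 0
    while len(out) < length:
        out += hmac.new(key, nonce + counter.to_bytes(4, "big"),
                        hashlib.sha256).digest()
        counter += 1
    return out[:length]

def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

def seal_ticket(stek: bytes, master_secret: bytes) -> bytes:
    """Server side: wrap the session's master secret under the STEK."""
    nonce = os.urandom(16)
    return nonce + xor(master_secret, keystream(stek, nonce, len(master_secret)))

def open_ticket(stek: bytes, ticket: bytes) -> bytes:
    """Whoever holds the STEK can unwrap the ticket."""
    nonce, body = ticket[:16], ticket[16:]
    return xor(body, keystream(stek, nonce, len(body)))

stek = os.urandom(32)            # held only by the server(s)
master_secret = os.urandom(48)   # protects the recorded session
ticket = seal_ticket(stek, master_secret)   # visible on the wire in ClientHello

# Eve recorded the ticket earlier and steals the STEK later.
recovered = open_ticket(stek, ticket)
assert recovered == master_secret
```

The ticket itself is public as far as a passive observer is concerned, so the STEK is the only thing standing between a recording and the plaintext.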

Either case also requires the adversary to observe and record past sessions (from handshake onwards). But recall that maintaining the confidentiality of all previously recorded sessions in the event of a compromise is exactly the assurance of perfect forward secrecy.

TLS 1.3 & 0-RTT

TLS 1.3 supports a resumption mode called “0-RTT” (zero round trip time). With 0-RTT, a client can use a Pre-Shared Key (PSK) to send an encrypted HTTP request with the very first message (after SYN/ACK). This is called “early data”.

Because the PSK is used directly as the symmetric encryption key, the early data transmitted by the client is not protected by forward secrecy (much like session tickets in earlier TLS versions). However, the client may start (EC)DHE key sharing in that first message, enabling the server to complete key exchange and send a response that IS protected by forward secrecy. If an attacker breaches the STEKs and has a (recently) recorded message, they can decrypt the PSK and then decrypt the early data — but starting from the server’s response, the attacker will not be able to decrypt later messages that are encrypted using the new shared key derived from (EC)DHE.

If you’re interested in digging into the details of how the TLS 1.3 handshake achieves this, I recommend this C3 talk by Filippo Valsorda and Nick Sullivan.

This sounds great! TLS 1.3 delivers forward secrecy with TLS resumption! Rejoice! Alas, briefly — 

In this document, the keywords "MUST", "MUST NOT", "REQUIRED", "SHOULD", "SHOULD NOT", and "MAY" are to be interpreted as described in RFC 2119.
https://datatracker.ietf.org/doc/html/rfc2119

That is, the client MAY start (EC)DHE key sharing next to the PSK-encrypted HTTP request in the TLS 1.3 ClientHello. The reality of whether this is applied in practice is…inconsistent.

This (along with systemic oppression) is why we can’t have nice things.

Implementations can choose to require that clients perform key exchange (either in resumption or as a brand new session). For example, BoringSSL requires fresh key material in TLS 1.3 resumption.

0-RTT Replay Attack

Without compromising either the server or the client, an attacker can replay a 0-RTT message from a client. While the message’s confidentiality and integrity are protected by the PSK, a replay attack can cause a server to perform the previously requested action again, potentially under different circumstances, e.g., “undo the last action” or “open this door” [for the attacker]. Thus, 0-RTT requests should be idempotent (which by definition means “produce the same result regardless of how many times this function is applied”).
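One way a server might limit replays is sketched below: remember a digest of every 0-RTT flight it has accepted, and only let through HTTP methods that are safe to repeat. The class name, the method allow-list, and the unbounded in-memory set are all assumptions for illustration; a real deployment would need expiry and would have to share this state across servers.

```python
import hashlib

class EarlyDataGate:
    """Accept each 0-RTT flight at most once, and only for safe HTTP methods."""

    SAFE_METHODS = {"GET", "HEAD", "OPTIONS"}

    def __init__(self):
        self._seen = set()   # digests of first flights already accepted

    def accept(self, client_hello: bytes, method: str) -> bool:
        if method not in self.SAFE_METHODS:
            return False     # make the client retry after the full handshake
        digest = hashlib.sha256(client_hello).digest()
        if digest in self._seen:
            return False     # byte-for-byte replay of an earlier flight
        self._seen.add(digest)
        return True

gate = EarlyDataGate()
first = gate.accept(b"recorded first flight", "GET")    # accepted
replay = gate.accept(b"recorded first flight", "GET")   # rejected as a replay
unsafe = gate.accept(b"another first flight", "POST")   # rejected: not safe to repeat
```

Rejecting early data does not fail the connection in this model; the request is simply processed after the handshake completes, when replay protection is back in place.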

Conclusion

You might have wondered when session resumption “officially” became part of SSL or TLS. This is hard to trace, because the practice of storing session keys and using the SessionID in ClientHello extensions for session resumption dates back earlier than the 1996 SSL 3.0 protocol, as captured for historical record by RFC 6101.

Papers analyzing SSL 3.0 refer to session resumption in SSL 2.0. E.g., in Wagner & Schneier’s “Analysis of the SSL 3.0 Protocol”, they specifically call out rollback attacks involving session resumption. The paper finishes with a specific callout around the sensitivity of the master secret:

Ensuring that the master secret remains truly secret is tremendously important to the security of SSL. All session keys are generated from the master secret, and the protection against tampering with the SSL handshake protocol relies heavily on the secrecy of the master secret. Therefore, it is important that the master secret be especially heavily guarded.

Wagner D, Schneier B (1996) Analysis of the SSL 3.0 protocol. In: The Second USENIX Workshop on Electronic Commerce Proceedings, vol 1, no 1, pp 29–40
https://www.usenix.org/legacy/publications/library/proceedings/ec96/full_papers/wagner/wagner.pdf

That being said, forward secrecy was just barely on the radar in the mid-1990s, and primarily from the perspective of compromising months and years of confidential messages after a private key is compromised. The idea of remotely (or physically) compromising a server in order to compromise the confidentiality of a handful of recent sessions was not a relevant threat at that time.

Even today, compromising TLS session resumption tickets remains a negligible threat in the overwhelming majority of threat models.

Welcome to Stupid TLS Facts! If you particularly enjoyed/hated this, why not share it with your friends?

Appendix: Nuances are made

Threat Modeling Conditional Forward Secrecy in TLS

Is the limited nature of the conditional forward secrecy provided by TLS relevant to Alice & Bob? Let’s examine a few scenarios that could be impacted by compromised TLS session information within 24 hours after these events:

  1. Alice browses Wikipedia: No confidential data.
  2. Alice receives an email from her mother: Potentially confidential information. Alice would be surprised by a breach of that confidentiality. Alice probably isn’t a target of an attacker. It’s up to Alice to evaluate her own threat model, but this is almost definitely not a priority.
  3. Alice reads Bob’s free speech and right to privacy blog: No confidential information. Alice should explain to Bob that freedom of speech is a right to expression without fear of government retaliation or censorship, give him 1984 by George Orwell, and teach him how to assess his threat model.
  4. Alice is a CEO of a major corporation who receives an email from her CFO: Likely confidential information. Alice would be surprised by a breach of that confidentiality. Alice is probably the target of an attacker. Could have moderate impact beyond Alice. Alice’s security org might evaluate how their critical services such as email providers configure TLS resumption — but any attacker that is interested in Alice’s corporation probably has more impactful targets in reach if they manage to get resumption keys (either at that corporation or at an email provider) — such as breaching the mail database itself. High effort and high risk for a “moderate” reward at best.
  5. Alice is a CEO of a major corporation whose customers have high expectations for confidential interaction with their services: Alice’s security org should definitely take care in evaluating resumption (and all) key management practices regularly, as well as configuration of TLS resumption. (E.g., BoringSSL requires fresh key material in TLS 1.3 resumption.)
  6. Alice is a leader of a Nation State who receives a confidential email from another Nation State leader, Bob: Definite breach of sensitive confidential information. Alice should definitely know about this risk and work to avoid it. Nation States are definitely high profile targets for many attackers, with potentially broad and severe impact for a breach of confidentiality. Alice is undoubtedly already using protocols supporting backward secrecy (AKA self-healing secrecy providing post-compromise confidentiality), such as Signal’s double ratchet algorithm.
  7. Alice is organizing protests against human rights violations: Alice should absolutely be relying on protocols supporting end to end encryption, full perfect forward secrecy, self-healing secrecy, and more — see the Electronic Frontier Foundation’s guide to Surveillance Self-Defense.

TLS cipher suites lacking forward secrecy

TLS 1.3 only supports cipher suites that provide forward secrecy (aside from the resumption mode without key exchange). Before TLS 1.3, some allowed cipher suites could not assure forward secrecy. These were allowed to enable compatibility.

In TLS 1.2, cipher suites without forward secrecy were marked as not recommended. Notably, RSA key exchange is deprecated.

See additional detail in a recent draft RFC by John Preuß Mattsson, “NULL Encryption and Key Exchange Without Forward Secrecy are Discouraged”. This RFC has some excellent references as well, including “Pervasive Monitoring Is an Attack” (RFC 7258).

TLS 1.2 ClientHello & Random Numbers

In TLS 1.2, the client and server each send new random numbers to derive a new shared secret for each session resumption. How is this different from TLS 1.3? In TLS 1.2, these random numbers are hashed together with the existing shared secret to create the new shared secret (an inexpensive operation), whereas in TLS 1.3, these new random values are used in (EC)DHE to derive a new shared secret (a relatively more expensive operation). (cf. RFC 4346, Security Considerations.)

Modern Architecture & Key Management

The reality of modern server architecture is that you’ll have a collection of servers—and more likely, a collection of load balancers terminating TLS at the “edge” of your cloud environment. The days of deriving an encryption key held only in memory by a single server are over: more likely, a set of STEKs will need to be managed across a collection of servers, so that any given server can decrypt the session tickets that another server created for that client.

As recommended by Adam Langley, these session ticket keys should be

  1. Randomly generated and distributed (probably regionally)
  2. Rotated frequently (to limit how severely forward secrecy is compromised)
  3. Held in memory and never written to persistent storage (because the only method to delete data with high assurance is to destroy the storage medium following NIST 800-88 media sanitization guidelines)
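The rotation and in-memory parts of that list can be sketched as a small key ring. The class name, the one-hour default, and the number of retained keys are assumptions for illustration, and distributing keys across servers is left out entirely.

```python
import os
import time

class StekRing:
    """In-memory ring of ticket keys: the newest encrypts, older ones only decrypt."""

    def __init__(self, rotate_after=3600.0, keep=2, clock=time.monotonic):
        self.rotate_after = rotate_after   # seconds between rotations (assumed value)
        self.keep = keep                   # how many keys remain valid for decryption
        self.clock = clock
        self._keys = [(self.clock(), os.urandom(32))]   # newest first

    def _rotate_if_due(self):
        now = self.clock()
        if now - self._keys[0][0] >= self.rotate_after:
            self._keys.insert(0, (now, os.urandom(32)))
            del self._keys[self.keep:]     # tickets under dropped keys stop working

    def current(self) -> bytes:
        """Key used to encrypt newly issued tickets."""
        self._rotate_if_due()
        return self._keys[0][1]

    def decryption_keys(self) -> list:
        """All keys still accepted when a client presents a ticket."""
        self._rotate_if_due()
        return [key for _, key in self._keys]

ring = StekRing()
issuing_key = ring.current()
```

The window in which a stolen key exposes recorded sessions is roughly `rotate_after * keep`, so both numbers are part of the forward secrecy story. Python gives no guarantee that dropped key bytes are wiped from memory, which a production implementation would have to handle.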

TLS 1.3 External PSKs

Earlier, I described Pre-Shared Keys in TLS 1.3 as a value provided by the server at the end of the TLS 1.3 handshake. However, as implied by the PSK naming, TLS 1.3 supports a client designating and using a Pre-Shared Key in exactly the same way as PSKs are used in IPsec or WPA.

RFC 9257 (“Guidance for External Pre-Shared Key (PSK) Usage in TLS”) addresses the security properties provided by PSKs and the sharp edges around designing key distribution for PSKs.

Categories
Cryptography

Invisible Salamanders in AES-GCM-SIV

By now, many people have run across the Invisible Salamander paper about the interesting property of AES-GCM, that allows an attacker to construct a ciphertext that will decrypt with a valid tag under two different keys, provided both keys are known to the attacker. On some level, finding properties like this isn’t too surprising: AES-GCM was designed to be an AEAD, and nowhere in the AEAD definition does it state anything about what attackers with access to the keys can do, since the usual assumption is that attackers don’t have that access, since any Alice-Bob-Message model would be meaningless in that scenario.

What is interesting to me is that this property comes up more often than one would think. I have run across it several times now during my work reviewing cryptographic designs, so it’s far from an obscure property for real world systems. The general situation these systems have in common is that they involve three parties: Alice, Bob, and Trent. Trent is a trusted third party for Bob, who is allowed to read messages and scan them, with details like when and why depending on the crypto system in question. While Trent and Bob agree on the ciphertext (say, because Trent hands Bob the ciphertext or because Alice presents Trent’s signature on it), Alice has the option of giving Trent and Bob different keys. The challenge for Alice is to come up with a ciphertext that has a valid authentication tag and still decrypts to different messages for Trent and Bob.

Mitigations

Before I dive deeper into how to construct invisible salamanders for AES-GCM and AES-GCM-SIV, a few words on how to defend against these problems. The easiest option here is to add a hash of the key to the ciphertext. This technically violates indistinguishability, as the identity of the key is leaked, i.e. an attacker now knows which key was used for the message. If indistinguishability is necessary, using the IV as a salt for the hash works well: constructions like HMAC-SHA-2(key=IV, message=key) (i.e. HKDF-Extract with the IV as the salt) are suitable, as long as attention is paid to whether or not this key hash can be used in any other context. In general, it shouldn’t, because the key should already only be used for AES-GCM/AES-GCM-SIV, but real world systems sometimes have weird properties.
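A minimal sketch of the salted key hash, using HMAC-SHA-256 with the IV as the HMAC key and the AEAD key as the message. The function names are assumptions for illustration.

```python
import hashlib
import hmac
import os

def key_commitment(key: bytes, iv: bytes) -> bytes:
    """HMAC-SHA-256 keyed with the IV, computed over the AEAD key."""
    return hmac.new(iv, key, hashlib.sha256).digest()

def check_commitment(key: bytes, iv: bytes, commitment: bytes) -> bool:
    """Constant-time check that `key` is the key the sender committed to."""
    return hmac.compare_digest(key_commitment(key, iv), commitment)

iv = os.urandom(12)
key_for_trent = os.urandom(32)
key_for_bob = os.urandom(32)

# The sender attaches this value to the ciphertext.
commitment = key_commitment(key_for_trent, iv)

assert check_commitment(key_for_trent, iv, commitment)
assert not check_commitment(key_for_bob, iv, commitment)
```

Both recipients check the commitment before decrypting, so Alice can no longer hand Trent and Bob different keys for the same ciphertext without one of the checks failing.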

Constructing Salamanders

With the mitigation out of the way, onto the fun part: constructing the messages. In order to understand why and how these attacks work, we first have to talk about \mathbb{F}_{2^{128}} and the way AES-GCM and AES-GCM-SIV use this field to construct their respective tags. As a finite field, \mathbb{F}_{2^{128}} supports addition, multiplication, and division, following the usual field axioms. The field has characteristic 2, which means addition is just the xor operator, and subtraction is the exact same operation as addition. Multiplication and division are somewhat more complicated and not in scope for this article. It suffices to say that multiplication can be implemented with a very fast algorithm if the hardware supports certain instruction sets (carryless multiplication). The division algorithm uses the Euclidean algorithm and will at most take 256 multiplications in a naive implementation, so while slower than the other operations, it will still be extremely fast. I will use + for the addition operation and \cdot for the multiplication operation. The most important caveat is to not confuse these operations with integer arithmetic.
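For readers who want to play with the field, here is a small, slow sketch of the arithmetic. It assumes elements are Python integers whose bit i is the coefficient of x^i, reduced modulo x^128 + x^7 + x^2 + x + 1. GCM itself uses a bit-reflected encoding of the same field, and the inverse here uses exponentiation rather than the Euclidean algorithm, so treat this as an illustration rather than a GCM-compatible implementation.

```python
import secrets

R = 0x87                 # x^7 + x^2 + x + 1, the low part of the field polynomial
MASK = (1 << 128) - 1

def gf_add(a, b):
    """Addition (and subtraction) in characteristic 2 is xor."""
    return a ^ b

def gf_mul(a, b):
    """Shift-and-xor multiplication, reducing whenever the degree reaches 128."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        b >>= 1
        a <<= 1
        if a >> 128:
            a = (a & MASK) ^ R
    return result

def gf_inv(a):
    """Inverse as a^(2^128 - 2), computed by square-and-multiply."""
    if a == 0:
        raise ZeroDivisionError("0 has no inverse")
    result, base, exp = 1, a, (1 << 128) - 2
    while exp:
        if exp & 1:
            result = gf_mul(result, base)
        base = gf_mul(base, base)
        exp >>= 1
    return result

a = secrets.randbits(128) | 1    # any nonzero element
assert gf_add(a, a) == 0         # every element is its own negative
assert gf_mul(a, gf_inv(a)) == 1
```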

AES-GCM

Next, on to AES-GCM. This AEAD is a relatively straightforward implementation of an AEAD that uses a UHF based MAC for authentication. Our IV is 12 bytes long, and we use a 4 byte counter and CTR mode to encrypt the message. The slightly odd feature is that we start the counter at 2, for reasons we will see later. For authentication, we first derive an authentication key H by encrypting the zero block (this is why we don’t start the counter at zero, otherwise the zero IV would be invalid). Now, using the ciphertext blocks, additional data blocks (both padded with zeros as needed for the last block), and adding a special length block containing the size of the additional data and the ciphertext, we get a collection of blocks, all of which I will refer to as C_i. To compute the tag, we now compute the polynomial

GHASH(H, C, T) = C_0\cdot H^{n+1} + C_1\cdot H^{n}+\dots + C_{n-1}\cdot H^2+C_n\cdot H+T

The constant term, T is the encrypted counter block associated to the counter variable of 1 (Which is why we started at 2 for the CTR mode). Remember that in characteristic 2 + is xor, so we could equivalently say that we compute the polynomial without the constant term and then encrypt it with CTR mode as the first block.

Now, how do we get two different plaintexts to agree on both ciphertext and tag? We first choose two keys and produce the corresponding keystreams, choosing the plaintexts so that the ciphertexts agree. (If you want two plaintexts that make sense, this part is the hardest step: you first brute force the first few bytes in order to be valid in one format and a comment opening statement in the other, so that you can switch which parts of the ciphertext will appear as valid plaintext and which parts appear as commented out.) We leave one ciphertext block open for now, as a sacrificial block that we will modify in order to make the tags turn out to be the same. Next, derive the corresponding authentication keys H_1 and H_2 and our constant terms T_1, T_2. This means we have C_i fixed, except for a specific index, say j, and can now solve

GHASH(H_1, C, T_1) = GHASH(H_2, C, T_2)

\sum_{i=0}^n C_i\cdot H_1^{n+1-i}+T_1=\sum_{i=0}^n C_i\cdot H_2^{n+1-i}+T_2

C_j\cdot\left(H_1^{n+1-j}+H_2^{n+1-j}\right)=\sum_{\substack{i=0\\i\neq j}}^n C_i\cdot \left(H_1^{n+1-i}+H_2^{n+1-i}\right)+T_1+T_2

C_j=\left(H_1^{n+1-j}+H_2^{n+1-j}\right)^{-1}\cdot\left(\sum_{\substack{i=0\\i\neq j}}^n C_i\cdot \left(H_1^{n+1-i}+H_2^{n+1-i}\right)+T_1+T_2\right)

by solving for the sacrificial block C_j.
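The algebra above can be checked with a short sketch. To stay self-contained it does not run AES: H_1, H_2, T_1, T_2 are random stand-ins for the values the two keys would produce, and the field uses a plain (not bit-reflected) encoding, so this demonstrates the linear algebra only and does not produce real AES-GCM ciphertexts. All function names are assumptions for illustration.

```python
import secrets

R, MASK = 0x87, (1 << 128) - 1   # x^128 + x^7 + x^2 + x + 1

def gf_mul(a, b):
    """Carryless multiply, reducing modulo the field polynomial as we go."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        b >>= 1
        a <<= 1
        if a >> 128:
            a = (a & MASK) ^ R
    return result

def gf_pow(a, e):
    result = 1
    while e:
        if e & 1:
            result = gf_mul(result, a)
        a = gf_mul(a, a)
        e >>= 1
    return result

def gf_inv(a):
    if a == 0:
        raise ZeroDivisionError("0 has no inverse")
    return gf_pow(a, (1 << 128) - 2)

def ghash(h, blocks, t):
    """C_0*H^(n+1) + ... + C_n*H + T, evaluated with Horner's rule."""
    acc = 0
    for c in blocks:
        acc = gf_mul(acc ^ c, h)
    return acc ^ t

def solve_sacrificial(h1, t1, h2, t2, blocks, j):
    """Return the value of block j that makes both tags equal."""
    n = len(blocks) - 1
    rhs = t1 ^ t2
    for i, c in enumerate(blocks):
        if i != j:
            e = n + 1 - i
            rhs ^= gf_mul(c, gf_pow(h1, e) ^ gf_pow(h2, e))
    e = n + 1 - j
    coeff = gf_pow(h1, e) ^ gf_pow(h2, e)
    return gf_mul(gf_inv(coeff), rhs)   # raises in the rare case coeff == 0

# Random stand-ins for the values AES would produce under the two keys.
h1, t1, h2, t2 = (secrets.randbits(128) for _ in range(4))
blocks = [secrets.randbits(128) for _ in range(4)]   # shared blocks C_0..C_3
j = 1                                                # index of the sacrificial block
blocks[j] = solve_sacrificial(h1, t1, h2, t2, blocks, j)
assert ghash(h1, blocks, t1) == ghash(h2, blocks, t2)
```

If the coefficient of C_j happens to be zero, picking a different sacrificial index (or different keys) gets around it.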

AES-GCM-SIV

So far so good, but what about AES-GCM-SIV? GCM is famous for having many weird properties that make it extremely fragile, like leaking the authentication key on a single IV reuse, or allowing for insecure tags smaller than 128 bits. In many ways, AES-GCM-SIV is what AES-GCM should look like for real world applications: much more robust against IV reuse, only revealing the damaging properties of a UHF with a reused IV if both IV and tag are the same. This is accomplished through using the tag as a synthetic IV, meaning the tag is computed over the plaintext, and then used as IV for CTR mode to encrypt. Even though this kind of SIV construction uses MAC-then-Encrypt, it is secure against the usual downsides due to CTR mode always succeeding in constant time, independent of the plaintext. This means the receiver can decrypt the message and validate the tag without revealing information about the plaintext in case of an invalid tag. The library needs to take care that the plaintext is properly discarded and not exposed to the user in case the tag does not validate.

The actual IV for AES-GCM-SIV is used primarily to derive a per message key. This means that if the IVs of two messages are different, both encryption and authentication keys will be unrelated and cannot be used to infer things about each other.

All in all AES-GCM-SIV works like this:

  • H, K_E = \operatorname{KDF}(K, IV)
  • T=\operatorname{AES}(K_E, P_0\cdot H^{n+1}+\dots+P_n\cdot H)
  • C=\operatorname{AES-CTR}(K_E, IV=T)

where the plaintext blocks P_i again contain additional data and length, and some extra hardening and efficiency tricks have been stripped for clarity.

Our previous approach of first creating the ciphertext and then balancing things out to get the tags to agree clearly cannot work here anymore. The keystream, and therefore the ciphertext, depend on the tag, so if we want to have any chance of finding a salamander, we have to fix the tag before we do any calculation at all. So after having chosen T, we decrypt it under each of our keys to get the result of our polynomial S_i=\operatorname{AES}^{-1}(K_{E,i}, T). What we are left with is finding plaintexts P_1, P_2 such that

S_i=\sum_{j=0}^n P_{j, i} H_i^{n+1-j}

which gives us a system of two linear equations with 2n unknowns. But these aren’t all the constraints we need to satisfy, since we still need to encrypt these plaintexts once we have the tag balanced. Here, we are lucky that everything is over characteristic 2: the CTR encryption is just an addition of the plaintext and the encrypted counter block, C_i=\operatorname{AES}(K_E, CB_i)+P_i. To say that two plaintexts result in the same ciphertext under two different keys is just fulfilling the equation

\operatorname{AES}(K_{E, 1}, CB_{j, 1})+P_{j, 1}=\operatorname{AES}(K_{E, 2}, CB_{j, 2})+P_{j, 2}.

This, like our two equations for the tag, is a linear equation. So in the end, for a plaintext that has a size of n blocks, we get n+2 linear equations with 2n variables. This means that, in almost all cases, we can construct an invisible salamander by adding only two sacrificial blocks, with the same caveat that the two plaintexts need to be partially brute forced.
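To see where the two sacrificial blocks come from, the system can be reduced by hand. The symbols \Delta_j, R_1, R_2, D and the sacrificial indices a, b are my own shorthand, not notation from the specification, and the sketch follows the simplified model above in which every block is treated alike (for blocks that are not encrypted, such as additional data and the length block, take \Delta_j = 0).

```latex
% Ciphertext equality fixes the second plaintext in terms of the first:
\Delta_j = \operatorname{AES}(K_{E,1}, CB_{j,1}) + \operatorname{AES}(K_{E,2}, CB_{j,2}),
\qquad P_{j,2} = P_{j,1} + \Delta_j

% Substituting into the two tag equations leaves two equations in the P_{j,1}.
% With every block except the sacrificial ones a and b already chosen:
R_1 = S_1 + \sum_{j \neq a,b} P_{j,1}\cdot H_1^{n+1-j}
R_2 = S_2 + \sum_{j=0}^{n} \Delta_j\cdot H_2^{n+1-j} + \sum_{j \neq a,b} P_{j,1}\cdot H_2^{n+1-j}

P_{a,1}\cdot H_1^{n+1-a} + P_{b,1}\cdot H_1^{n+1-b} = R_1
P_{a,1}\cdot H_2^{n+1-a} + P_{b,1}\cdot H_2^{n+1-b} = R_2

% Cramer's rule (signs vanish in characteristic 2):
D = H_1^{n+1-a}\cdot H_2^{n+1-b} + H_1^{n+1-b}\cdot H_2^{n+1-a}
P_{a,1} = D^{-1}\cdot\left(R_1\cdot H_2^{n+1-b} + R_2\cdot H_1^{n+1-b}\right)
P_{b,1} = D^{-1}\cdot\left(R_1\cdot H_2^{n+1-a} + R_2\cdot H_1^{n+1-a}\right)
```

The construction works whenever D is nonzero, which is the "almost all cases" condition: if D vanishes, a different pair of sacrificial positions or a different tag can be tried.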

Test Code

I’ve put this to the test and have written code to produce AES-GCM (Java) and AES-GCM-SIV (C++) salamanders.

Categories
Cryptography (Incomprehensible)

Cartier Divisors

As an obvious first blog post, easily understandable and very relevant to cryptography (/s), here is a description of Cartier Divisors, because Thai asked for it on Twitter.

For this, first some history: A while ago, I taught a Google internal course about the mathematics of elliptic curves. It would probably make sense to start with that content, but I’m going to assume that I’ll come back to it and fix the order later.

Anyways, the objects we are looking at are Divisors and Principal Divisors. They come up when studying curves as a way to describe the zeroes and poles of functions. Over the projective line \mathbb{P}_K^1 (also known as just the base field plus a point at infinity), a rational function (the quotient of two polynomials) can have any selection of zeroes and poles it so pleases, with the only constraint being that there must be (with multiplicity) the same number of zeroes and poles. We can see that by looking at

\frac{(X-a_1)(X-a_2)\dots (X-a_n)}{(X-b_1)(X-b_2)\dots (X-b_n)}

for a function with zeroes at a_1, a_2, \dots, a_n and poles at b_1, b_2, \dots, b_n. If a_i or b_i is \infty, then we ignore the corresponding term, and get a zero/pole at infinity.

On more general curves, we do not have this amount of freedom. The lack of freedom we have in choosing zeroes and poles is tied surprisingly deeply to the curve in question, so describing it turns out to be very useful.

A Weil divisor is a formal sum of points of the curve, that is, we assign an integer to every point of the curve, with all but finitely many points getting assigned the integer zero. The degree of a divisor is the sum of all these integers. The divisor of a function \operatorname{div}f is the divisor we get by assigning the order of the function in that point to the point, i.e. setting it to 1 for simple zeroes, -1 for simple poles, and so on. If a divisor is a divisor of a function, we call the divisor a principal divisor.
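A small worked example on the projective line may help. The bracket notation [P] for a point used as a formal symbol is my own choice here.

```latex
% Zeroes a_1 = a_2 = 1, poles b_1 = 3 and b_2 = \infty (whose factor is dropped):
f = \frac{(X-1)^2}{X-3}

% Double zero at 1, simple pole at 3, simple pole at infinity
% (the numerator has degree 2, the denominator degree 1):
\operatorname{div} f = 2\,[1] - [3] - [\infty]

% The degree is the sum of the coefficients:
\deg(\operatorname{div} f) = 2 - 1 - 1 = 0
```

This is the "same number of zeroes and poles" constraint from above, restated as: principal divisors have degree 0.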

With these definitions out of the way, we can get to Thai’s question. It turns out that the thing we are interested in is the divisors of degree 0 modulo the principal divisors. This group in some sense measures how restricted we are in our choice for placing zeroes and poles. It turns out that for elliptic curves, all divisors of degree 0 are equal to a divisor of the form P - O, with O being the point at infinity (or really any fixed (“marked”) point on the curve), up to a principal divisor (equal up to principal divisor is also called linearly equivalent). So what Thai is asking is: while we can think of principal divisors as a description of rational functions, what are the other divisors? The simple answer to that is that they are just what we said, formal sums of points, just some points with some integer weights. For elliptic curves, they are conveniently in a 1:1 correspondence with the points of the curve itself, which is why we usually gloss over the whole divisor thing and just pretend to add points of the curve themselves. But this answer is kind of unsatisfying, and it does not generalize well in higher dimensions or for curves with singularities in them, so a better concept is needed.

Enter Cartier Divisors. In order to explain these, we’re technically going to need sheaves, but sheaves are a bit much, so I’ll try to handwave some things. The basic idea is, since we want to describe zeroes and poles, why don’t we just use zeroes and poles for that? Of course we can’t use a full function that is defined everywhere for that, that would only give us the principal divisors. But locally, we can use a function to describe zeroes or poles. Now what does locally mean? In algebraic geometry, the topologies we’re using are kind of weird. Here, we are using the Zariski topology, which for curves just means that when we say locally, we mean the whole curve with a finite number of points removed. We use this to remove any zeroes or poles we don’t want for our divisor from our local representative.

All in all, that means a Cartier divisor on a curve C is a covering (U_i), i.e. a collection of open sets (the curve minus finitely many points) such that their union is the whole curve, and a rational function f_i per U_i, defined on U_i. This function’s zeroes and poles are what we understand as the divisor. Obviously, we now need to make sure all these partial functions work well as a whole. We do that by looking at U_i \cap U_j and the functions f_i and f_j restricted to that intersection. If we want this construction to define a consistent divisor, then f_i/f_j cannot have any zeroes or poles in U_i \cap U_j. We write this as

f_i/f_j \in \mathcal{O}^\times (U_i \cap U_j)\;\;.

This now describes a consistent global object with zeroes and poles as we want them, getting quite close to describing divisors in a completely different way! We just have one problem, there are way too many functions with a specific pattern of zeroes and poles on our local neighborhood U_i, we need to get rid of all the extra behavior that isn’t just zeroes and poles! To do that, we need to look at two functions f_i and g_i on U_i that have the same pattern of zeroes and poles. What happens when we take f_i/g_i? Well we, as above, get a function without zeroes or poles on U_i. So if we want to forget all that extra structure, we need to take f_i modulo the set of functions without zeroes or poles on U_i. And that’s it.

If we write \mathcal{M}^\times(U_i) for the rational functions that are not equal to zero (so the rational functions that have a multiplicative inverse) and write \mathcal{O}^\times (U_i) for the functions without zeroes or poles on U_i, we can now describe a Cartier divisor as a covering (U_i) together with an element f_i \in \mathcal{M}^\times(U_i)/\mathcal{O}^\times(U_i) such that f_i/f_j\in\mathcal{O}^\times(U_i \cap U_j). A principal Cartier divisor is a Cartier divisor that can be described using just the entire curve C as the only element of the covering.
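As a concrete instance, here is the single point 0 on the projective line written as a Cartier divisor. The two-set covering is the simplest one I could pick, not the only possible choice.

```latex
% Covering of \mathbb{P}^1_K by two open sets:
U_0 = \mathbb{P}^1_K \setminus \{\infty\}, \qquad U_1 = \mathbb{P}^1_K \setminus \{0\}

% Local functions: X has a simple zero at 0 and nothing else on U_0;
% the constant 1 has no zeroes or poles on U_1.
f_0 = X, \qquad f_1 = 1

% Compatibility on the overlap U_0 \cap U_1 = \mathbb{P}^1_K \setminus \{0, \infty\}:
f_0 / f_1 = X \in \mathcal{O}^\times(U_0 \cap U_1)
```

The only zero of X is at 0 and its only pole is at \infty, both of which are missing from the overlap, so the compatibility condition holds. The resulting divisor has a simple zero at 0 and nothing else, so it has degree 1 and cannot be principal: any single global rational function would need a pole somewhere to balance the zero.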

For extra bonus points (which I will not describe in detail here, because this blog post is already way too long and completely incomprehensible), we can look at what happens if we now take these Cartier divisors modulo principal Cartier divisors. It turns out, that the result can be described again with a covering U_i, but this time, instead of going through all that choosing of rational functions per set, we just use the intersections, and choose an element f_{ij}\in \mathcal{O}^\times (U_i \cap U_j), without even looking at rational functions in U_i at all, with some sheafy/cohomological rules for when two of those things are equal.