Char | mikePietsch.com

5.8 Array Type: Fixed-Length Homogeneous Collections

Right, arrays. Let’s talk about the workhorse, the trusty (if sometimes rigid) container that you’ll use more often than you’ll check your email. An array is the simplest way to say, “I need exactly N things of exactly the same type, and I want them right next to each other in memory.” It’s a fixed-length, homogeneous collection. Let’s unpack that jargon. Fixed-length means you declare its size upfront and that’s it. No take-backsies. The array’s length is part of its type signature. A [i32; 5] is a completely different and distinct type from a [i32; 6]. This is Rust being its usual brutally honest self with the compiler: it needs to know exactly how much stack space to allocate for your array, and it can’t do that if the size might change.

5.7 Tuple Type: Grouping Heterogeneous Values

Right, so you’ve met arrays, which are all about order and homogeneity. A tuple is its delightfully messy cousin: a fixed-size collection where each position can have its own, specific type. It’s the data structure you reach for when you need a quick, lightweight grouping of disparate items without the ceremony of defining a full-blown struct or class. Think of it as a minimalist, type-safe bag for your values. The Anatomy of a Tuple You create a tuple by simply wrapping a comma-separated list of values in parentheses. The magic, and the entire point, is in its type signature. Let’s look at a classic example: a function returning an HTTP status code and a message.

5.6 char: A Four-Byte Unicode Scalar Value

Now, let’s talk about char. You might be coming from a language where a char is a single, lonely byte representing an ASCII character. Rust politely asks you to forget all that. In Rust, a char is not a byte; it’s a Unicode Scalar Value. This is a fancy way of saying it represents a single Unicode code point, and it takes up a full 4 bytes in memory. “Why on earth would you do that, Rust?” I hear you cry. It’s not for fun, I promise. It’s for correctness. By making a char 32 bits wide, Rust can guarantee that every char is a valid, self-contained Unicode value. This completely sidesteps the nightmare of trying to figure out if a byte is part of a UTF-8 sequence or not when you’re just trying to iterate over characters. It’s a bit like using a sledgehammer to crack a nut, but the nut is the entire history of broken text encoding, and the sledgehammer is beautifully designed.

5.5 bool: true and false

Right, let’s talk about bool. It seems laughably simple, doesn’t it? true or false. One bit. On or off. What could possibly go wrong? Well, my friend, welcome to the wonderfully absurd world of software, where we’ve managed to build entire careers on top of this single, solitary binary decision. At its heart, a boolean is the most fundamental unit of logic in your program. It’s the answer to every yes/no question you will ever ask your code: Is the user logged in? Should this element be visible? Did that horribly complex operation just succeed? It’s the bedrock of every if, while, and for statement you’ll ever write. Getting it right is non-negotiable.

5.4 Floating-Point Types: f32 and f64

Right, let’s talk about numbers that lie to you. Not maliciously, mind you, but out of sheer, fundamental, mathematical necessity. We’re entering the world of floating-point numbers, and if you think 0.1 + 0.2 == 0.3, prepare to have your entire reality gently, but firmly, corrected. We use them to represent a huge range of values, from the mass of a proton to the national debt, by allowing the decimal point to “float.” Rust gives you two main flavors: f32 (single-precision) and f64 (double-precision). Unless you’re working on a deeply embedded system where every byte counts, just use f64. It’s the default for a reason: it’s double the precision (about 15-16 decimal digits instead of 6-7 for f32) and on modern hardware, the performance difference is often negligible. The extra headache you save yourself from weird precision errors is worth it.

5.3 Integer Overflow: Panics in Debug, Wrapping in Release

Right, let’s talk about what happens when your integer math goes sideways. You’re probably thinking, “It’s just a number, how bad could it be?” Oh, my sweet summer child. In most languages, this is a silent, catastrophic failure. In Rust, it’s a conversation, and you get to choose how that conversation goes. The core design choice here is brilliant: in development, you want to know immediately when your assumptions are violated. In production, you might need a predictable, if technically wrong, outcome to keep the whole thing from crashing.

5.2 Integer Literals: Decimal, Hex, Octal, Binary, and Byte

Right, let’s talk about how you tell a computer, “Here, have a number.” It seems simple, but like most things in computing, we’ve devised a few different ways to do it, each with its own historical baggage and modern use case. I’m going to show you the whole cast of characters: decimal (our everyday numbers), hex, octal, binary, and the slightly oddball byte literal. Pay attention; this is where a lot of subtle, head-scratching bugs are born.

5.1 Integer Types: i8 Through i128, u8 Through u128, isize, usize

Alright, let’s talk about the building blocks of numbers in Rust. This isn’t your high school algebra class; we’re dealing with the raw, unforgiving metal of the machine. Rust forces you to be explicit about your numbers because it values correctness over convenience, a trade-off you’ll learn to love (or at least respect). The first thing to know is the great divide: signed and unsigned integers. A signed integer (i8, i16, i32, i64, i128) can be negative, zero, or positive. An unsigned integer (u8, u16, u32, u64, u128) can only be zero or positive. Think of it like a i for “I can be negative” and a u for “uh, only positive, please.” The number (8, 16, 32, etc.) is the size in bits. More bits means you can count higher (or lower, in the case of i types) at the cost of using more memory. This isn’t just pedantry; getting it wrong can lead to catastrophic bugs.

5. Data Types: Scalars and Compounds

6.6 Pattern Matching: LIKE, ILIKE, and SIMILAR TO

Alright, let’s talk about finding stuff. You’ve got a database full of text, and you need to pull out the rows where a column kind of matches what you’re looking for. Maybe you need everyone whose name starts with ‘Mc’, or all error codes that look like ‘404-%-something’. This is where we leave the clean, well-lit house of equality operators (=) and venture into the wild, sometimes messy, woods of pattern matching.

6.5 String Functions: length, substring, position, trim, upper, lower

Right, let’s talk about making your text do what you want. You’ve got strings in your database, and staring at them isn’t going to change them. You need tools. The functions we’re about to cover are your basic wrench and screwdriver set. They’re simple, they’re essential, and if you don’t know how to use them, you’re going to end up trying to hammer in a screw. I’ve seen it. It’s not pretty.

6.4 Database Encoding: UTF-8, SQL_ASCII, and Collations

Right, let’s talk about the alphabet soup of database text storage. You’ve probably already created a varchar column or two, but the real decisions—the ones that will either save your bacon or cause a multi-byte, accented-character-filled nightmare at 2 AM—happen at the encoding level. This isn’t academic; it’s the bedrock of storing anything beyond “Hello World.” Think of encoding as the dictionary your database uses to translate bits into letters. If you tell it to use an English dictionary (like LATIN1) and then try to store a Japanese character (こんにちは), it’s going to have a complete meltdown. The only sane choice for a modern application is UTF-8. It’s the one dictionary to rule them all, capable of storing pretty much every character from every human writing system, plus emojis. Yes, your database can store that pile of poop emoji (💩). You’re welcome.

6.3 char(n): Fixed-Length Strings and Padding Behavior

Now we arrive at the char(n) type, the most misunderstood and, frankly, most often misused character type in the SQL universe. Its defining characteristic is that it is fixed-length. When you define a column as char(10), every single value in that column is exactly 10 characters long. Always. No exceptions. The database engine enforces this with a zeal usually reserved for bouncers at an exclusive club. If you insert a string shorter than the defined length, say ‘cat’ into a char(10) column, PostgreSQL doesn’t just store those three characters. Oh no. It pads the string with spaces until it reaches the specified length. It becomes ‘cat ’ (with seven trailing spaces). When you retrieve that value, by default, the trailing spaces are stripped off, making it seem like the padding never happened. This is the source of much confusion and many a quiet bug.

6.2 varchar(n): Length-Limited Strings and When to Use Them

Alright, let’s talk about varchar(n). This is the data type you’ll use 99% of the time when you’re storing strings that aren’t the size of a small novel. Think of it as a string with a maximum length you get to define—the (n) part. It’s the “I know this field should be roughly this long, but I’m not a monster, I’ll let you use less space if you want” option. It’s the workhorse, and you need to understand its quirks, because they will bite you if you’re not careful.

6.1 text: The Unlimited Variable-Length Type

Right, let’s talk about text. This is the one you’ll use 95% of the time you need to store anything longer than a simple word or two. It’s PostgreSQL’s workhorse for storing strings of practically any length you can throw at it. Forget about pre-defining a maximum length; that’s the whole point. It’s a variable-length type, meaning it only takes up the space it needs (plus a tiny bit of overhead), which is exactly what you want most of the time.