Nonstandard Deviations

turning and turning in the widening gyre

Imagine we had a magical machine that takes in mathematical functions, does a $\frac{d}{dt}$ to them, and then spits the result out:

$$x(t) \to \color{Blue}{\frac{d}{dt}} \to y(t)$$

You might be tempted to write $y(t) = x'(t)$ or maybe even $y = x'$ or $y(t) = \frac{d}{dt}(x(t))$, but the first thing you want to correct is that attitude 😉 We're not actually talking about x(t) or y(t) at all. We're talking about the machine, $\frac{d}{dt} [ {\color{Red}\star} ] $. That's our actual object of focus.

What can we do with this machine? Well, as any good scientist would do, first we probably want to test a couple of different inputs to see what just passes through the machine unharmed, like how water passes through a turbine unharmed.

$$x(t) = t \to \color{Blue}{\frac{d}{dt}} \to y(t) = \frac{d}{dt} [ t ] = 1$$

That's not it.

$$x(t) = t^n \to \color{Blue}{\frac{d}{dt}} \to y(t) = \frac{d}{dt} [ t^n ] = n t^{n-1}$$

That's not it either. Hm. This might be tougher than I thought.

$$x(t) = \alpha t^n + \beta t^m \to \color{Blue}{\frac{d}{dt}} \to y(t) = \frac{d}{dt} [ \alpha t^n + \beta t^m ] = n \alpha t^{n-1} + m \beta t^{m-1} $$

Okay, that's also not it — but that's kind of interesting. Somehow, this system's output scales up and down exactly with an input that scales up and down. It also preserves addition. It's what we would call linear.
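Written out as a single property, that looks like the line below. The names $f$ and $g$ are my own stand-ins for any two differentiable inputs; they aren't anything special.

$$\frac{d}{dt} [ \alpha f(t) + \beta g(t) ] = \alpha \frac{d}{dt} [ f(t) ] + \beta \frac{d}{dt} [ g(t) ]$$

Plug in $f(t) = t^n$ and $g(t) = t^m$ and you get the polynomial example right back.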

Let's try something that ain't a polynomial.

$$x(t) = e^t \to \color{Blue}{\frac{d}{dt}} \to y(t) = e^t \color{Red}{= x(t)}$$

Success! $e^t$ passes through our machine unchanged.


Eigenfunctions are the tiniest bit more abstract than that. They are things you throw into the machine, and they pass out, unchanged except for the scaling. We do have to be a bit careful with that word, however.

Not really in the sense of letting $x(t) = e^t$ and then doing

$$2000 \cdot x(t) \to \color{Blue}{\frac{d}{dt}} \to y(t) = 2000e^t \color{Red}{= 2000 \cdot x(t)}$$

because we actually get that through linearity anyway, no need for a second special German name for it.

But suppose instead we did $x(t) = e^{5t}$. In that case,

$$x(t) = e^{5t} \to \color{Blue}{\frac{d}{dt}} \to y(t) = 5 \cdot e^{5t} \color{Red}{= 5 \cdot x(t)}$$

which is something we don't get “for free” from linearity. You can't do that with $t^{5n}$; your output is gonna be something times $t^{5n-1}$, which is a very different thing than $t^{5n}$!
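If you'd rather poke the machine than trust the algebra, here is a minimal numerical sketch. The helper `ddt`, its step size `h`, and the sample points are all assumptions of mine; it approximates the derivative with a central difference rather than computing it exactly.

```julia
# Central-difference approximation of d/dt; `h` is an assumed step size.
ddt(x; h = 1e-6) = t -> (x(t + h) - x(t - h)) / (2h)

x_exp(t) = exp(5t)   # the eigenfunction candidate from the text
x_pow(t) = t^5       # a polynomial, for contrast

# Ratio y(t) / x(t) at a few sample points.
ts = [0.5, 1.0, 2.0]
ratio_exp = [ddt(x_exp)(t) / x_exp(t) for t in ts]
ratio_pow = [ddt(x_pow)(t) / x_pow(t) for t in ts]

println(ratio_exp)  # each entry is close to 5
println(ratio_pow)  # entries differ, because 5/t depends on t
```

A constant ratio means the output is just a scaled copy of the input. A ratio that moves around with $t$ means the machine changed the shape.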


Okay, big whoop. $e^{st}$ passes through the machine and is just $s \cdot e^{st}$ after, itself scaled up or down. Why do we care?

See if you can pick up what I'm putting down here:

  • We have this pretty simple little thing that passes through the machine unharmed.
  • The machine is kind of hard to predict for completely arbitrary things – not things like $t^5$ or even $\sin(\cos(\tan(t)))$, although I couldn't figure out the second one anyway, but maybe even randomly changing functions. Think like, an earthquake sweep, or a noisy reading from a digital thermometer.
  • But wait – if we can find a way to take things apart, and reconstruct them as some combination of the simple little thing, we might be able to figure out what the machine will do to a lot more stuff than we would otherwise.

It turns out that we can in fact do that. You can actually take apart a whole bunch of different signals, reconstruct them as $e^{\alpha t} + e^{\beta t} + \dots$, run those through the machine, and actually be able to predict the output, because all you're doing is running a bunch of eigenfunctions (things which pass through unchanged except for the scaling) through a machine that preserves additivity.
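Here is the smallest version of that trick, with two pieces. The weights $c_1$ and $c_2$ are assumed constants I'm adding for illustration; linearity lets each piece go through on its own, and each piece only picks up its own scaling.

$$x(t) = c_1 e^{\alpha t} + c_2 e^{\beta t} \to \color{Blue}{\frac{d}{dt}} \to y(t) = c_1 \alpha e^{\alpha t} + c_2 \beta e^{\beta t}$$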

I'm sweeping a bunch of stuff under the rug here – things like convolutions, and Fourier transforms, and z-transforms and stuff – but that's the basic gist. One really interesting little catch is that in order to really represent everything we want to, we need to let $\alpha, \beta, \dots \in \mathbb{C}$. That is to say, we need to let them be complex numbers.
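One standard example of why, using Euler's formula (the frequency $\omega$ is just an assumed real constant): a plain cosine is two complex exponentials in a trench coat, and each one passes through the machine with its own scaling.

$$\cos(\omega t) = \tfrac{1}{2} e^{i \omega t} + \tfrac{1}{2} e^{-i \omega t} \to \color{Blue}{\frac{d}{dt}} \to \tfrac{i \omega}{2} e^{i \omega t} - \tfrac{i \omega}{2} e^{-i \omega t} = -\omega \sin(\omega t)$$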

But they'll teach you that in class. Or, rather, they'll talk about it in class, and then you'll figure it out when you do some homework problems. Good luck! ♥

Let me tell you about the fast track.

The fast track begins before you're born. It's already decided for you by where your parents decide to put down their roots. It's decided for you by which one of the 40,000 high schools in America you're eventually going to attend.

Most of these 40,000 send 0 children to a fast track university. A large handful send 1. A few send multiple.

And some very scant few send dozens.

These are called feeder schools. My high school was a feeder to Harvard. We sent over 20 seniors there when I graduated.

I wasn't one of them.

One thing you'll notice: For most people, the fast track is never easy. But once you're on it, it's considerably easier to stay on it. It's much, much harder to get on it.

To get into my high school, you needed to have top grades and beat x% of kids on a standardized test, where x > 80, if memory serves. But you did not get tested every day and replaced once your average from the last week went down.

That better-than-x% thing is called a percentile. Percentiles are cool because they give a quick-and-dirty way of comparing effort levels among similar things. Beating 80% of the other kids on a test you've never heard of is a lot like beating 80% of the other kids on the SATs when you did take them — so many years ago — a score of about 1250 / 1600, or 1870 / 2400 if you took it during its wild-and-crazy years. Some of you are saying “Really? That's it?” Some of you are saying “Really? That high?” Yes, that's it, yes, that high.

I fell off the fast track. I dropped out of high school. I performed a minor miracle: I got back on it.

I'm not sure if that was the right move, in the end.

Inspired by an old Reddit post on using flashcards to learn programming, which I think is a very bright idea implemented in a way nobody but him and me would ever do.


I love flashcards. I love apps like Anki for timing its flashcards so that I see them less and less as they sink into long-term memory, a la the spacing effect. And I love sites like Quizlet for doing the opposite, and providing me the convenience to cram flashcards on whatever dumb little language feature is stuck in my craw this time. (Believe me, there's no shortage of those.)

You know what I hate? Learning new software libraries. No, not the libraries themselves – just the process of grinding your way through them. For me, at least, learning a new library more often than not looks like this:

  1. Spend too much time trying to code something buggy and error-prone, and decide that somewhere, someone must have already done a better job than me.
  2. Google around and pick the best of maybe 3 or 4 library options based on surface characteristics, while choking down the fear that maybe I didn't choose the best one for my needs. That maybe fate itself wants me to suffer.
  3. Get it installed, and start flipping through the documentation for something vaguely related to what I want to do.
  4. Grab the first example that crosses my vision that I can remotely think of a way to solve my problem with.

And then I make a small amount of headway on the problem, and then I return to the documentation, and make a little more headway, and then I return to the documentation, and ... realize there was a function that made this whole problem so much easier to handle that I didn't see the first time around.

Maybe instead of awkwardly throwing around string splits and findlasts, I find its older, more general cousin, findnext. Maybe instead of iterating through all the nodes of a graph to count them up, I just find |nodes(graph)|. Maybe these two examples feel so simple, you can't believe anyone would ever be so impatient as to try to work like that, right up until the next time you're up at 3 in the morning desperately crunching through your CS assignment due the next day and all you want to do is go to sleep and put this coding nightmare behind you and at least the mental energy of rolling your own tiny solutions can keep you going whereas reading more than 6.257 words of documentation will 1,000% put you to sleep and make you miss your deadline.

When I grab a library, I'm usually already pretty impatient. I don't want to have to book out an hour to read the documentation. But neither do I want to skim through documentation of a tool I'll most likely have to use again in the future anyway, and leave without understanding half of the baked-in power the library has to offer.

If only there were a better way.

A better way

Let's say we have a library that defines 20 new functions.

The documentation has the following information:

  • The type signature of the function.
  • A short description of what the function does.
  • A few basic examples of code implementing that function, combined with return values.
  • Maybe a few more advanced examples if the function has any subtle gotchas with elements of the base language, or other functions in the library.

You can make 1 flashcard per function where you get shown the description, and you're asked for the name of the function that does what's described. The answer side can show the full type signature.

Then, you can take 2 or 3 of the basic examples, and make that many flashcards where you get shown the code with one of the return values omitted, and you're asked for the return value. The answer side can show the code with the return values filled back in.
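The two card recipes above can be sketched in a few lines of Julia. Everything here is hypothetical scaffolding of mine: the `DocEntry` and `Flashcard` types, their field names, and `make_cards` are assumptions about how you might store a documentation entry, not part of any existing tool. The sample entry borrows `findnext` purely as an illustration.

```julia
# Hypothetical shape for one documented function; field names are assumptions.
struct DocEntry
    name::String
    signature::String
    description::String
    examples::Vector{Pair{String,String}}  # code => return value
end

struct Flashcard
    front::String
    back::String
end

# One "name that function" card, plus up to `n_examples` "what does this return?" cards.
function make_cards(doc::DocEntry; n_examples = 3)
    cards = [Flashcard(doc.description, doc.signature)]
    for (code, result) in Iterators.take(doc.examples, n_examples)
        push!(cards, Flashcard("$code\n# returns ?", "$code\n# returns $result"))
    end
    return cards
end

# Illustrative entry only; the strings are card text and are never executed.
entry = DocEntry(
    "findnext",
    "findnext(pattern, string, start)",
    "Find the next occurrence of a pattern in a string, starting at a given index.",
    ["findnext(\"z\", \"Hello to the world\", 1)" => "nothing",
     "findnext(\"o\", \"Hello to the world\", 6)" => "8:8"],
)

cards = make_cards(entry)
println(length(cards))  # 3: one description card plus two example cards
```

With 20 functions and 3 examples apiece, the same loop would hand you 80 cards.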

If you use a Quizlet-like flashcard system which only allows you to hold a few cards in your hand at once, and you run through these autogenerated cards, you can first go through the new cards to see what's on the other side – then, when they come around again, try to actually answer them based on what you saw 30 seconds ago. (You can mark it right if you got the gist of the thing; we're not looking for memorization here, we're just looking to build familiarity with everything at our disposal.) This works a lot better than you would think, especially if you do it right before you actually dive into using the library.

In my experience, 80 flashcards doesn't actually take all that long when you are only asking yourself to remember one minimal nibble of information per card, even if it's your first time seeing them. The active mental effort you need to expend looking at the various cards helps keep you actually focused on the task at hand, which is much better than reading through documentation, at least for me. And at the end, you'll have effectively gone through and given your brain a succinct overview of the basic capabilities of all 20 functions in the library. When you run across a problem where the author intended a nice, idiomatic-sounding approach, you'll be far less likely to pass it up just because you didn't see it.

TDD works, at least a little bit. It's also very simple:

  1. Write tests the code will have to pass.
  2. Write the code.
  3. Run the tests.
  4. If the tests succeed, proceed.
  5. If not, debug and repeat. (Usually the code is what needs to be debugged; sometimes the tests need to be debugged; most rarely, but most valuably, you got both parts subtly wrong.)

The designers of Julia are seasoned devs. That's why they included a well-documented Test package right there in the standard library. But let's see if we can skip the documentation and just give you some tools right now to make your life easier.

@test is very simple: if the expression you hand it evaluates to true when all is said and done, you get a Test Passed. If not, you get an error thrown your way.

julia> using Test

julia> # No need to add the Test package; it's in the standard library.

julia> @test 1 == 1
Test Passed

julia> @test 1 == 2
Test Failed at none:1
  Expression: 1 == 2
   Evaluated: 1 == 2
ERROR: There was an error during testing

Pretty straightforward. Let's try something a little spicier, like variable arguments and string concatenation.

julia> using Test

julia> f(x...) = x[1] * x[2]
f (generic function with 1 method)

julia> strings = ("Fire", "Water", "Air", "Earth")
("Fire", "Water", "Air", "Earth")

julia> @test f(strings) == strings[1] * strings[2]
Error During Test at none:1
  Test threw exception
  Expression: f(strings) == strings[1] * strings[2]
  BoundsError: attempt to access (("Fire", "Water", "Air", "Earth"),)
    at index [2]
  ###### (blah blah blah blah blah blah ..........) #######

ERROR: There was an error during testing

Oops. Looks like I did something wrong. But since I tested it now, I know that I did something wrong. If someone accidentally changes it down the line, and the @test fails again 6 months from now, we can trace it back quickly to this exact bit of code. So let's revise and try again.

What could've gone wrong? Ah. I see. We should have just passed our tuple elements as arguments, since f(x...) wraps them in a tuple anyway. Let's try again.

julia> using Test

julia> f(x...) = x[1] * x[2]
f (generic function with 1 method)

julia> @test f("Fire", "Water", "Air", "Earth") == "FireWater"
Test Passed

Beautiful. Now let's go back and figure out how we should actually put in a tuple, if we wanted to for a laugh. I seem to recall that ... works by “unfolding” tuples in an actual function call.

julia> strings = ("Mind", "Body", "Light", "Sound")
("Mind", "Body", "Light", "Sound")

julia> @test f(strings...) == strings[1] * strings[2]
Test Passed

And hey! What if we wanted to make sure the same error we made always gets thrown? After all, someone might change the code down the line in a subtle way that makes our ordinary @tests still pass, but breaks some try ... catch ... finally code somewhere else. That could be a serious pain.

Introducing @test_throws. You'll notice that above, we had a BoundsError, because we tried to index into the 2nd place of a 1-tuple that contained the 4-tuple we actually cared about. So:

julia> using Test

julia> f(x...) = x[1] * x[2]
f (generic function with 1 method)

julia> @test_throws BoundsError f(("Mind", "Body", "Light", "Sound"))
Test Passed
      Thrown: BoundsError
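
Once the REPL poking is done, the same checks can live together in a file. This is a minimal sketch using @testset from the same Test package; the testset name is just a label I made up.

```julia
using Test

f(x...) = x[1] * x[2]

@testset "f concatenates its first two arguments" begin
    strings = ("Fire", "Water", "Air", "Earth")

    # Arguments passed one by one.
    @test f("Fire", "Water", "Air", "Earth") == "FireWater"

    # The same tuple, splatted into separate arguments.
    @test f(strings...) == strings[1] * strings[2]

    # Passing the tuple whole makes x a 1-tuple, so x[2] is out of bounds.
    @test_throws BoundsError f(strings)
end
```

Grouping tests this way means a failure in one line doesn't stop the rest from running, and you get a single summary at the end.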


This is a follow up to my investigation into Julia's helpful, but slightly idiosyncratic, ways of condensing down tuples.

Having evocative language for things is helpful. So I'm going to introduce the terms I've been using mentally since writing that post to describe nested empty tuples and the ways they get condensed down.

  • A () is a coin, same as in Haskell.
  • A ( (), ) is a coin purse, because it holds coins.
  • A ( ( (), ), ) is a bank, because it holds coin purses.

When Julia does the syntactic-sugar thing where ((())) == (()) == (), we call that robbed. So:

  • A ((())) typed at the REPL is a robbed bank,
  • A (()) typed at the REPL is a robbed coin purse.

I'll leave it to your imaginations to generalize these further. 😉
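
If you want to see the zoo with types attached, here is a quick sketch. The variable names are just the nicknames above, nothing Julia knows about.

```julia
coin  = ()
purse = ((),)
bank  = (((),),)

println(typeof(coin))   # Tuple{}
println(typeof(purse))  # Tuple{Tuple{}}
println(typeof(bank))   # Tuple{Tuple{Tuple{}}}

# Parentheses without a trailing comma only group, so both of these are the coin.
robbed_purse = (())
robbed_bank  = ((()))
println(robbed_purse === coin && robbed_bank === coin)  # true
```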

One thing I love about Julia is that it uses true vectors, not “n-by-1” arrays that look kinda-sorta like vectors at a distance if you squint.[^1]

You can see this immediately from the type of a vcat call. The {Int64,1} reveals the gospel as according to Jules:

julia> vcat(1,2,3,4,5,6,7,8,9)
9-element Array{Int64,1}:

(Credit where credit is due – whoever coded the way big matrices print, great work. It's just lovely. ♥)

Not to worry, though; if you prefer to think in terms of rows rather than columns, and are willing to accept the very slight performance hit, you can also work with hcats:

julia> hcat(1,2,3,4,5,6,7,8,9)
1×9 Array{Int64,2}:
 1  2  3  4  5  6  7  8  9

It's a little less idiomatic, but I promise I won't tell the Pope you're writing your verse in vulgar Latin if you promise not to tell him I've been cutting our holy water with regular well water. What I find disturbing, however, is this:

julia> hvcat(3,          1,2,3,    4,5,6,     7,8,9)
3×3 Array{Int64,2}:
 1  2  3
 4  5  6
 7  8  9

hvcat fills in its values row-first. Doesn't this seem odd for a language that gives a (slight) edge to column vectors over row vectors?

Well, actually, no, it doesn't. First off, hvcat starts with an h, not a v. We would expect it to delineate something row-first. More importantly, pretty much everyone who's taken linear algebra thinks in “row-by-column” form. And when you want to look into a multidimensional matrix, that's exactly how you should do it:

julia> telephone = hvcat(3,    1,2,3,     4,5,6,     7,8,9)
3×3 Array{Int64,2}:
 1  2  3
 4  5  6
 7  8  9

julia> getindex(telephone, 2, 3)
6

julia> # Second row, third column.

But then why is it that when we use only a single index, we still go column-first and not row-first?

julia> telephone = hvcat(3,    1,2,3,     4,5,6,     7,8,9)
3×3 Array{Int64,2}:
 1  2  3
 4  5  6
 7  8  9

julia> getindex(telephone, 3) # Third down.
7

julia> getindex(telephone, 7)
3

Arrrrrrrrrrgh. If you're going to make the 2-index form go row-by-column, then you should make the 1-index form iterate through the rows, not the columns. And I know this sounds like such a minor issue, but these kinds of issues of conceptual integrity are important to me, because they're exactly the kinds of subtly counter-intuitive things that developers will make a mistake on once a year, maybe, but then spend half a day trying to find and debug it. I mean, a change like this doesn't even affect the iteration of n-by-1 and 1-by-m arrays – it just makes dealing with multidimensional arrays more clear.
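
For what it's worth, here is a small sketch of how to see the column-first order and get a row-first walk anyway. It uses the matrix literal form of the same 3×3 array; `row_first` is just a name I picked.

```julia
telephone = [1 2 3; 4 5 6; 7 8 9]   # same matrix as the hvcat call above

# A single index walks down each column in turn (column-major order).
println(telephone[3])     # 7
println(vec(telephone))   # [1, 4, 7, 2, 5, 8, 3, 6, 9]

# CartesianIndices translates a linear index into (row, column).
println(Tuple(CartesianIndices(telephone)[7]))  # (1, 3)

# To walk row-first instead, index a transposed copy.
row_first = vec(permutedims(telephone))
println(row_first)        # [1, 2, 3, 4, 5, 6, 7, 8, 9]
```

The transposed copy costs an allocation, so this is a workaround for readability, not a free lunch.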


[^1]: (Okay, that was mean. No offense to my pure math friends — I just don't like losing sleep over wondering if I'm losing computing cycles over things that the computer thinks are 2D, but really aren't.)

A lot of computer-savvy folks find it crazy that people don't care about privacy enough to do something as simple as install an ad blocker or switch their search engine from Google to DuckDuckGo. And a large subset of these folk find it even crazier that they themselves don't do these simple things.

But in my experience, people care a lot about privacy – just not the kind of privacy these applications provide.

  • When you see people use Snapchat because they have faith that their messages will disappear, that counts as a privacy concern.
  • When you see people say “Can we do this over the phone?” because part of them worries the difficult conversation that will ensue might catch them in poor lighting and they don't want text screenshots floating around their circles, that counts as a privacy concern.
  • When you see people run alternate Facebook accounts so that they can post memes without their boss seeing, that counts as a privacy concern.

Of course, these kinds of things generalize:

  • When you see people prefer platforms with disappearing messages and features that notify you when the other person has screenshotted their phone, that counts as a privacy concern.
  • When you see people prefer audio and video formats to text formats because audio and video don't come with a baked-in Ctrl+C, Ctrl+V command, that counts as a privacy concern.
  • When you see people use alts because they don't want some parts of their social group to see their messages, that counts as a privacy concern.

In my experience, these kinds of privacy concerns, which I'll call near privacy[^459d], don't get brought up in cybersecurity circles in proportion to how much the average end user actually worries about them.

And with good reason – they feel different to grapple with than questions about SHA-256 and Heartbleed. They're not so easily reified into the world of mathematics, where bright minds can tinker with them in an unreasonably effective way. To me, they feel much more like design problems, all focused around the central theme of “How do we make sure this person can speak their mind, without fear of it being brought up against them later?”

Ironically, I don't really care about these features myself. The privacy part of my privacy-first blogging platform is a distant third to me after its cheap monthly cost and its minimal design. But I do think it's helpful to have this conceptual handle in your mental arsenal.


[^459d]: The name for this comes from picturing social connections as a connected graph. Most of us are, on average, friends-of-friends-of-friends-of-friends-of-friends of whatever J. Random Hacker we imagine wants to steal our data in the abstract. You can consider that far privacy, because you're more than, oh, about 3 degrees removed from the person in question. Near privacy is about protecting sensitive details from friends, and to a lesser extent friends-of-friends, and sometimes even friends-of-friends-of-friends.

A tuple is an immutable data type in Julia. (1, 2, 3) is a tuple, and its type is Tuple{Int64,Int64,Int64}. You can't change it by doing something nifty like (1,2,3)[1] = "fish", although you can make a brand new ("fish", 2, 3) from whole cloth if you wanted to.
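
Here is a minimal sketch of both halves of that claim. The names `t`, `threw`, and `fishy` are mine, and `Base.tail` is just one assumed way of building the new tuple.

```julia
t = (1, 2, 3)

# Tuples have no setindex! method, so in-place assignment throws a MethodError.
threw = try
    setindex!(t, "fish", 1)
    false
catch err
    err isa MethodError
end
println(threw)  # true

# Building a new tuple is fine; Base.tail drops the first element.
fishy = ("fish", Base.tail(t)...)
println(fishy)  # ("fish", 2, 3)
println(t)      # (1, 2, 3), the original is untouched
```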

() is also a tuple, although it's an empty tuple — the Haskell community sometimes calls this a coin. Its type is (wait for it) Tuple{}.

So what do you think the type of (()) would be? Tuple{Tuple{}}? Well...

julia> ()
()

julia> (())
()

julia> ((()))
()

julia> (((())))
()

julia> typeof( (()) )
Tuple{}

Not exactly. Julia condenses down empty tuples, unless you twist its arm a bit to give you what you actually typed:

julia> ()
()

julia> ((),)
((),)

julia> (((),))
((),)

julia> (((),),)
(((),),)

julia> typeof( ((),) )
Tuple{Tuple{}}

This would mostly be fine, except that tuples are intimately tied to function arguments with the f(x...) syntax. The tuples with this syntax are not condensed by default in the same way:

julia> f(x) = x
f (generic function with 1 method)

julia> f( () )
()

julia> f( (()) )
()

julia> g(x...) = x
g (generic function with 1 method)

julia> g( () )
((),)

julia> g( (()) )
((),)

So why don't functions condense empty tuples the same way that (()) does? Isn't that a bit unintuitive?

Well, slightly, yes. But when we're typing out nested tuples, it's easy to force them if we have to – we just do ((),). If we wanted our g(x...) = x to return () instead of ((),) when we call it as g( () ), we would need to also provide some way to force g to not condense it when we have to.

That seems like a lot of work, for very little gain – in fact, net negative gain, because I'm pretty sure the average Julia coder isn't anal enough to be messing around with empty tuples all that often! 😂
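
For the curious, here is a short sketch of how the pieces above line up, using the same f and g. The last line is the existing escape hatch: splatting an empty tuple at the call site passes no arguments at all.

```julia
f(x) = x       # one positional argument
g(x...) = x    # collects all arguments into a tuple

println(f(()))      # ()
println(g(()))      # ((),)  one argument, which happens to be empty
println(g())        # ()     no arguments at all
println(g(()...))   # ()     splatting an empty tuple passes no arguments
```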
