Reimagining Matrices (2012)

32 points by g0xA52A2A 2 years ago

"Error establishing a database connection"

https://web.archive.org/web/20230205025754/http://conal.net/...

layer8 2 years ago

A glitch in the matrix.

Koshkin 2 years ago

More specifically, this is about “reimagining matrices in the context of a pure functional language.” To be honest, as much as I appreciate the functional paradigm, I have been haunted by the impression that trying to apply it to some things makes them unnecessary complicated. Speaking of matrices, their “first principles” come from solving systems of linear equations; the (abstract) linear algebra reduces their role to being mere representations of linear maps in a given basis, which makes matrices look like some sort of technicality and thus hides their intrinsic beauty, complexity, and huge importance in their own right.

ndriscoll 2 years ago

How are matrices important in their own right besides as representations of linear maps (taking that to mean module homomorphisms)?
Funny enough a couple years back I implemented matrices as representations of linear maps for fun (I think I fixed the dimension to 3. I don't remember whether being generic on dimension makes it more complicated), and while that's not going to win any awards for performance, I found that it's the exact opposite of unnecessarily complicated: a "matrix" is just a way to "print"/serialize a function as an array of its values on a basis, addition is literally `f + g = x -> f(x) + g(x)` and multiplication is literally `f * g = x -> f(g(x))`. The code almost writes itself.
- gjm11 2 years ago
  
  Don't forget duality. A matrix can represent a linear map (or, if you prefer, a module homomorphism) but if you think of it as a map from V to W* instead of to W, it represents instead a bilinear form on VxW.
  And of course an array doesn't need to be a matrix, and there are many many purposes for which a rectangular array of numbers is a useful thing to have. (I don't know whether those purposes are relevant to the OP, because at the moment what the OP says is "Error establishing a database connection".)
  There are also kinda-intermediate situations. Consider linear least-squares regression. You have a bunch of things that at first sight are mere arrays. Let's say you're trying to predict house prices on the basis of floor areas, bedroom counts, and average income in the area, so you have a big array: one row per house, one column for each of those quantities. Call this A. You reckon the house price might be well predicted by some sort of low-order polynomial thing, so you build another array, also one row per house, with columns containing things like "floor area times average income squared". Call this one D. And you know the actual prices of the houses, which you put in a column vector, again one row per house. Call this v. On the face of it, none of these things looks much like a linear map, but it turns out that linear least-squares regression is just linear algebra and e.g. the regression coefficients are computed as (Dt D)^-1 Dt v where Dt is the transpose of D (though in practice you shouldn't just literally do that calculation).
  (Of course you can interpret these things as linear maps on vector spaces. That's not a bad way to organize the proof that the regression coefficients are what I said they are. But e.g. I don't think there's any particular significance to the linear map represented by Dt.D, or to the vector Dt.v .)
  
  ndriscoll 2 years ago
  
  > A matrix can represent a linear map (or, if you prefer, a module homomorphism) but if you think of it as a map from V to W* instead of to W, it represents instead a bilinear form on VxW.
  If you mean that it's also a bunch of other things like an element of V* tensor W or a linear form on V tensor W*, and none of these are the most "natural"/better than the others, then I guess that's true. Matrices still seem like the least natural to me though.
  For your least squares example, on the face of it it's not linear until you choose to use a linear least squares model :)
  It's been a decade or so since I've thought much about this stuff, so I'm not highly confident in this answer, but I think the reasoning goes something like this:
  You're trying to solve f(x)=y, but the problem is y is outside of the image of f. The next best thing is to say you at least want <f(x'),f(x)> = <f(x'),y> for all x'. i.e. f(x)=y up to dot products with all possible test vectors in the image of f, which forces their projections onto im(f) to be equal. Then <x',ft(f(x))> = <x',ft(y)> for all x', so ft(f(x))=ft(y), or x = (ft f)^-1 ft y.
  It's not obvious to me that this minimizes the square norm of f(x)-y, (or that (ft f) is necessarily invertible) but I guess the point is that that the only nonzero part is in the orthogonal complement of im(f), so you can't do any better.
  
  messe 2 years ago
  
  > Don't forget duality. A matrix can represent a linear map (or, if you prefer, a module homomorphism) but if you think of it as a map from V to W* instead of to W, it represents instead a bilinear form on VxW.
  Okay, but why does that need a matrix to conceptualize? Given any linear map L: V → W, and another map φ: W → W* (which is equivalent to assuming a basis for W*, i.e. what you're doing when treating L as a map from V to W* instead of to W), then you can construct a bilinear form 〈v,w〉 = L(v)φ(w).
  
  gjm11 2 years ago
  
  You don't need matrices to conceptualize these things, you need them to represent them for purposes of calculation. If you want to do calculations on linear maps V -> W, you usually do it with matrices. And if you want to do calculations on bilinear forms on VxW, you also usually do it with matrices.
- incrudible 2 years ago
  
  Making it too slow for production use is one hell of a complication.
  
  ndriscoll 2 years ago
  
  Simple, obviously correct implementations that mirror the problem definition are super useful for tests though! Write both, and then you can property test that the two implementations produce the same outputs on random inputs.
- ithinkso 2 years ago
  
  > How are matrices important in their own right besides as representations of linear maps
  they can also represent groups
  
  naasking 2 years ago
  
  So then there must be some isomorphism between linear maps and groups?

fatneckbeard 2 years ago

interesting that matrices are a kind of data structure in mathematics that has so many 'invalid' operations.

in the integers basically the only thing you can do to break arithmetic is 'divide by 0'

in matrices there are a lot more ways to do something resulting in an error.

there are ways to reimagine integers so that divide by zero is not an error. they are esoteric but they exist.

i wonder if they have done this for the matrix?

chongli 2 years ago

I think the key here is that matrices are a "higher-kinded" data structure. That is, they're parameterized not only by the shape (number of rows and columns) but also by the type of the elements themselves. Furthermore, the elements of a matrix need not be plain numbers (such as plain integers or floats) but can themselves be elements of any sort of ring, such as polynomials or even elements of a quotient ring.
Now, you may think a matrix whose elements are polynomials with coefficients that are integers modulo some prime p with the polynomial ring itself quotiented by some irreducible polynomial might be too bizarre and esoteric to be useful but in fact these structures do see a lot of use in cryptography.
ly3xqhl8g9 2 years ago

"for matrix multiplication, even in relatively simple cases, every step can present more than 10^12 options." [1] Not sure if valid, but it's worth considering: what if we don't have the best matmul algorithm yet.
[1] https://www.quantamagazine.org/ai-reveals-new-possibilities-...
Yoric 2 years ago

In mathematics, operations are basically sets. The fact that `n/0` is not defined... doesn't really make 0 or division special, so I don't suppose that anybody has seriously worked on that.
It's very different in programming, of course.
- untoxicness 2 years ago
  
  This is correct as long as you restrict your point of view only to sets. Even in pure mathematics one is usually interested in sets with additional structures. For instance, even though one speaks of the "set of real numbers" one usually implies that there is extra structure. In particular one usually requires that this set has operations "addition" and "multiplication" that make this set a ring. Then zero is the neutral element of addition and the annihilator of multiplication, so very special indeed!
  
  Yoric 2 years ago
  
  Operations are just sets :)
- resource0x 2 years ago
  
  Seriously worked on what?
  
  Yoric 2 years ago
  
  On what the GP was asking, i.e. giving a meaning to matrix operations that are not defined.
mmdski 2 years ago

See the pseudoinverse: https://en.wikipedia.org/wiki/Moore–Penrose_inverse.
Koshkin 2 years ago

> basically the only thing
There is also the overflow which is even more nasty than the zero-divide, because often it goes unnoticed.
- mjhay 2 years ago
  
  One of the many problems you get from assuming that floating point "numbers" are a field (or ring for ints), when they aren't even associative, let alone commutative. I hope in 100 years we have a practical representation for the reals without all these slings and arrows. Floats are more-or-less the best minimal message length encoding if you have no prior information, but I think there could be better encodings for most purposes, especially to prohibit many cases of numerical issues.
  
  dahart 2 years ago
  
  Interesting thought! Does a rational representation do any better by field standards? https://iquilezles.org/articles/floatingbar/
  What would it take to have a practical representation that’s better? I’m not well versed in the fields and rings theories in math, but it seems like the fundamental problem here is that floats have a finite number of bits, which means they are forced to round the results. That rounding is what causes a very large portion of all the trouble, it is what breaks associativity & commutativity. Since there is no practical way to have an unlimited number of bits, I wonder if a practical field of floats cannot exist in general?
  
  mjhay 2 years ago
  
  I know rational representations (with bignums) can work better for some purposes (albeit not high-perf numerical ones), but the rub is, like you say, you're trying to represent an uncountably infinite set with finite memory, so everything is going to be some kind of space/time/fidelity tradeoff.
  The only concrete finite fields are the finite fields of size p^k, where p is prime. I'm kind of thinking that there must be some representations that at least can shove off the breaking of associativity, commutativity, etc more into the background. It's probably more of a computer science and usability problem, but hopefully backed up by enough theory to provide some guarantees.
  You can make a "free" field as a programming interface (like in the OP) - and the floats would then be a leaky implementation of that interface. I think the question is how to make an implementation that better captures what we talk about when we talk about real numbers, while having good enough performance for big numerical tasks, and avoiding common numerical problems. Definitely not easy, and I think application dependent to a great extent - a place where one really has to be able to glide up and down levels of abstraction.
  
  bodhiandphysics 2 years ago
  
  The thing about floating point numbers is that while they're not associative... they are approximately associative (mod inf and nan). In other words (ab)c = a(bc) + e where e is is a number much smaller than max(a,b,c). This means that we can measure quite easily how bad an operation is in floating points, using what's called the condition number, and there are many many good ways of either estimating or bounding the condition number. So floating point numbers are easy to analyze, though they require a bit more machinery than most programmers get in intro to computer science. Really floats are the right way to do calculations involving reals on the computer. Of course, if you're not actually using reals, you shouldn't use them!
  
  mjhay 2 years ago
  
  Yeah the condition number helps in simple cases (or when solving single linear systems), but things rapidly spiral once you start getting into solving stiff ODEs, etc - to the point that only numerical specialists can judge it. I feel like there ought to be ways of dealing with these issues in a more fine-grained manner with better guardrails - something like a numeric-aware typesystem or automatic algorithmic scheme generator. Like I said though, maybe in 100 years.
  
  bodhiandphysics 2 years ago
  
  stiffness has nothing to do with fp numbers... stiff equations are just as hard to solve using arbitrary precision bigints since the issue is the limits of approximations of the integral.
  
  mjhay 2 years ago
  
  That's getting back to my point though - the integral is over measurable sets of reals, and we are approximating that with finite-difference floats. This is a complex problem with many guises.
  
  fatneckbeard 2 years ago
  
  there is no practical representation of the reals because their theory is based on infinite numbers and computers are finite.
  aka you are never going to store sqrt 2 in a register.
  the closest you can get is symbolic algebra packages. but at the end you still have to come down from infinite land and actually have an ASCII decimal representation because you cannot cut a board at sqrt2 feet long or fill your car with sqrt2 gallons of gasoline. or pay pi dollars for a pizza and expect your bank to keep a list of all the different transcendental atomic values you have used for payments over the years. "Your account has 17 pi plus 45 sqrt2 plus 97 e minus the golden ratio" is never going to work very well.
  one thing that i think would help is to put non-real mathematics into the mathematics curriculum and stop handwaving about how a computer does real numbers.. (it doesnt). teach discrete math, modular arithmetic, prime number theory, at the same time as pre-calculus.

naasking 2 years ago

You can peruse the talk slides while the blog is inaccessible:

https://github.com/conal/talk-2012-reimagining-matrices

atan2 2 years ago

Error estabilishing database connection. :(

LargoLasskhyfv 2 years ago

Tears in the Matrix...