TeMPOraL 15 hours ago

> APPEND is one of the things that are completely irrelevant 99.99% of the time… yet can be extremely useful when the need arises.

Is it really that irrelevant? I mean, if you look past the specifics (directories, interrupts, DOS versions), this seems to be implementing the idea of bringing something into scope, in particular bringing it into scope from the outside, to modify the behavior of the consumer (here, assembler) without modifying the consumer itself. Today, we'd do the equivalent with `ln -sr ../../inc ../inc`.

I'd argue the general idea remains very important and useful today, though it's definitely not obvious, looking back from the future, that this is what APPEND was going for.

  • theamk 13 hours ago

    Yes, the general idea is still very important and useful, but this is a post about a specific command called "APPEND" in the MS-DOS environment. Also, it's not an equivalent of "ln -sr", as "ln" replaces targets rather than stacking them. The proper modern equivalents are environment variables like LD_LIBRARY_PATH, PYTHONPATH, PKG_CONFIG_PATH, etc., and overlayfs mounts for the generic case.

    But back to APPEND: in all my time working with MS-DOS, I don't remember ever needing it, so it was 100% irrelevant to me. But this could be because I worked with more "modern" programs (like Turbo Pascal 5) which had good support for directories.

anyfoo 13 hours ago

Always a joy when os2museum updates.

I, too, remember the trifecta of APPEND, JOIN, and SUBST. And while I always thought they were interesting, I also wondered, for most of them, when I would ever use them. At the time, DOS versions (and hence applications) that didn’t know about subdirectories didn’t cross my mind, as my first DOS version was 2.11, I think.

  • mikaraento 27 minutes ago

    SUBST was commonly used for source code directories. It served at least two purposes: making source appear at the same location on all machines and working around path length limits.

    I've also used SUBST when reorganizing network drive mappings.

  • Joe_Cool 12 hours ago

    When I got my fancy 1.6 GB hard disk I used `subst R: .\cd` to run my games that needed the CD in the drive from a directory on the hard drive instead. Boy did load times improve a ton.

    • pavlov 4 hours ago

      CD-ROMs were extremely slow from a modern point of view. The original 1x speed was 150 KB/s, or 1.2 Mbps.

      That’s like trying to stream over a 3G mobile network with substantial packet loss, except it’s physically inside your computer.

      • bluedino 12 minutes ago

        Plus 400-500ms seek times and a driver that might suck up all your precious CPU when reading data

wruza 12 hours ago

My favorite program in DOS was smartdrv.exe. I know it was a much later addition, but it was a game-changer for those slow hard drives. Even a tiny cache size (I believe I tried the kilobytes range) sped things up like 10-20x.

Even Windows 3.x and 95 (surprisingly) ran faster with smartdrv preloaded. 95’s default cache for some reason was worse than smartdrv and audibly worked my HDDs harder.

The second favorite was a TSR NG viewer, can’t remember the name.

  • skissane 11 hours ago

    > The second favorite was a TSR NG viewer, can’t remember the name.

    I know what a TSR is, but what's an "NG viewer"?

    • wruza 10 hours ago

      Norton Guides, IIRC. E.g. Ralf Brown's Interrupt List was available for it. Reading the docs without leaving the editor made programming much easier.

      • anyfoo 8 hours ago

        I sometimes wistfully look back to the days where I had a bunch of books and printouts open on my desk for programming. Of course, that's more than likely romanticizing things quite a lot...

  • anyfoo 8 hours ago

    > and 95 (surprisingly) ran faster with smartdrv preloaded. 95’s default cache for some reason was worse than smartdrv

    My (weak) guess is that "32 bit disk access" and "32 bit file access" weren't active then, i.e. Windows 95 did not use its native 32 bit disk drivers, but 16 bit BIOS access. I have a hard time seeing how Smartdrv could have done anything otherwise, unless it really had some tight integration with 32 bit Windows (which would be very surprising, as it's a 16 bit TSR first and foremost).

    But yeah, overall, I agree, it's surprising what even a tiny cache does. If you just manage to eliminate the seek times to the root directory for pretty much every single access to a new file, that can probably do a lot already. Especially in the days before "native command queuing", where the HD would try to reorder disk accesses to find a better (physical!) head seek path across the platter. Later HDs had some cache on board.

  • hulitu 3 hours ago

    > My favorite program in DOS was smartdrv.exe.

    The downside was that, in case of a crash, you would lose some data.

troad 3 hours ago

This is really neat!

Are there any good books on DOS that a person who enjoys articles like this may also enjoy?

And more broadly, does anyone have any good books to suggest about the personal computers of the 80s and 90s, that don't skimp on the messy technical details?

  • fredoralive 2 hours ago

    “Undocumented DOS” goes very deep into the technical details, if you want to know what’s going on behind the curtain.

SunlitCat 16 hours ago

Another handy command dating back to DOS is SUBST.

Came in pretty handy when I wanted to share a folder with Remote Desktop, but it would only let me select whole drives.

Made a SUBST drive letter for that folder, worked like a charm!

  • Dwedit 15 hours ago

    SUBST makes use of NT Object Namespace symbolic links to register the drive letter. After running SUBST, you get an object named "M:" (or whatever your drive letter is) sitting in the "\??\" directory; its full path will be "\??\M:". It will be a symbolic link that points to something like "\??\C:\target_path".

    You can either see this by using "NtQuerySymbolicLinkObject" on "\??\M:", or calling "QueryDosDeviceW" on "M:". On Windows NT, you will see the result as an NT-native path "\??\C:\target_path" rather than a Win32-style path "C:\target_path".

    "\??\" is not some kind of special notation for paths or anything, it is a real NT object that exists. It holds your drive letters and other things.

    On Windows 9x, you won't see an NT-native path from QueryDosDevice, you'll instead see a Win32-style path "C:\target_path".

    Weirdly enough, Sysinternals WinObj is unable to find the symbolic link object at all, even though it exists when you query it using NT API calls.

    Fun fact about NT native paths: you can use them in Win32 if you prefix them with "\\?\GLOBALROOT". So "\??\C:\" becomes "\\?\GLOBALROOT\??\C:\". You can use this in any Win32 program that doesn't actively block that kind of path (such as the file explorer/open dialog).
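
    You can see all this for yourself with a minimal C sketch using QueryDosDeviceW (it assumes "M:" is a drive letter you've mapped, e.g. via SUBST; adjust as needed):

        #include <windows.h>
        #include <stdio.h>

        int main(void)
        {
            WCHAR target[MAX_PATH];

            /* Assumes M: exists (e.g. created via SUBST); change the letter. */
            if (QueryDosDeviceW(L"M:", target, MAX_PATH))
                wprintf(L"M: -> %ls\n", target);  /* e.g. \??\C:\target_path on NT */
            else
                wprintf(L"QueryDosDeviceW failed: %lu\n", GetLastError());
            return 0;
        }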

    • abdulhaq 3 hours ago

      NT wasn't even a twinkle in Dave Cutler's eye when DOS APPEND came into being

  • bombcar 16 hours ago

    IIRC originally SUBST was designed for that - early programs didn't understand directories but did understand drives, and so you could make a directory appear to be a drive and they'd be happy - otherwise they'd dump everything in the root of C:\ (or A:\).

  • technion 15 hours ago

    SUBST to this day is how you solve long file name problems. OneDrive for Business can make a very long path if it uses your full business name. Windows has the API to let some apps save such long paths, but not to let Explorer or PowerShell delete those folders.

    You go one folder up and use SUBST to make a drive letter from which you can delete the content.

    • asveikau 12 hours ago

      Explorer (known in MS jargon as "the shell", but I'm avoiding the term because it confuses Unix users) is limited to MAX_PATH characters when it stores absolute paths.

      Win32 allows you to use paths longer than MAX_PATH when you 1. use the UTF-16 filename APIs and 2. prefix your path with \\?\.

      But the shell doesn't do the latter. It may also use fixed-size buffers for paths, which is a common reason for such limitations.
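
      A minimal sketch of the \\?\ escape hatch (the long path here is hypothetical; \\?\ requires an absolute path with backslashes and disables Win32 path normalization):

          #include <windows.h>
          #include <stdio.h>

          int main(void)
          {
              /* Hypothetical path deeper than MAX_PATH; the \\?\ prefix
                 lifts the limit for the UTF-16 ("W") APIs. */
              HANDLE h = CreateFileW(
                  L"\\\\?\\C:\\some\\very\\deep\\directory\\tree\\file.txt",
                  GENERIC_READ, FILE_SHARE_READ, NULL,
                  OPEN_EXISTING, FILE_ATTRIBUTE_NORMAL, NULL);

              if (h == INVALID_HANDLE_VALUE)
                  wprintf(L"CreateFileW failed: %lu\n", GetLastError());
              else
                  CloseHandle(h);
              return 0;
          }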

  • mycall 16 hours ago

    I still use SUBST with my team so we all have our source code on P:\ which can be mapped to wherever they want it to be. This helps keep Visual Studio object files and project includes pointing to the same place, especially when mistakes are made (they should be relative paths but things happen).

    It is run from a registry key upon bootup.

    • magicalhippo 15 hours ago

      We do the same, except you don't need to do it at bootup, you can set it once using the following:

          Windows Registry Editor Version 5.00
      
          [HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\Session Manager\DOS Devices]
          "P:"="\\??\\C:\\Dev\\Source"
      
      Change source path accordingly, save as .reg file, import once and it'll stay.

      The nice thing about this vs using SUBST is that SUBST is per-user, so if you have a separate admin account it won't see it. The registry entry above, however, is machine-wide, so all users on the machine will see it.

      Obviously makes it less useful for terminal servers and other shared machines.

      • Kwpolska 15 hours ago

        I think SUBST can break when you run as administrator (elevating your own privileges).

        • jasomill 8 hours ago

          It still works, but the elevated context maintains a separate list of drive letter mappings, so you need to issue the SUBST command again while elevated.

          The same applies to network drive letter mappings.

          Under the hood, both are implemented as NT Object Manager symbolic links, which you can see using, e.g.,

          https://learn.microsoft.com/en-us/sysinternals/downloads/win...

    • Kwpolska 15 hours ago

      SUBST is all fine, up until the point some tool explodes when it sees that normalizePath("P:\\whatever") == "C:\\code\\whatever", and it ends up with two paths to one file, or no way to build a relative path. I’ve seen that happen with some node tooling, for example.

      • Dwedit 15 hours ago

        I've seen programs blow up because they think they need to "canonicalize" the path, when all they actually want is a full path (a rooted path rather than a relative path).

        Canonicalizing the path resolves all drive letters and symbolic links. The Rust function that canonicalizes a path can fail on some mounted filesystems if the drive letter wasn't registered using the Windows Mount Manager.
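
        In Win32 terms, the difference looks roughly like this (a sketch assuming P: is a SUBST of C:\code; Rust's std::fs::canonicalize is the moral equivalent of the handle-based call):

            #include <windows.h>
            #include <stdio.h>

            int main(void)
            {
                WCHAR full[MAX_PATH], canon[MAX_PATH];
                HANDLE h;

                /* "Full path": rooted, but keeps the SUBSTed letter: P:\whatever */
                GetFullPathNameW(L"P:\\whatever", MAX_PATH, full, NULL);
                wprintf(L"full path: %ls\n", full);

                /* Canonicalizing resolves P: away, to \\?\C:\code\whatever */
                h = CreateFileW(L"P:\\whatever", 0, FILE_SHARE_READ, NULL,
                                OPEN_EXISTING, FILE_FLAG_BACKUP_SEMANTICS, NULL);
                if (h != INVALID_HANDLE_VALUE) {
                    GetFinalPathNameByHandleW(h, canon, MAX_PATH,
                                              FILE_NAME_NORMALIZED);
                    wprintf(L"canonical: %ls\n", canon);
                    CloseHandle(h);
                }
                return 0;
            }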

      • tom_ 15 hours ago

        I think some of the WSL stuff refuses to deal with SUBST'd drives. And if you use voidtools's Everything, it's worth spending 2 minutes setting up some excluded paths so that you don't get doubled up entries. But it seems it does generally work pretty well. I've recently done work for a former employer, who it seems are still using SUBST'd drive Z:, just as they'd always done when I worked there nearly 20 years ago. (Main reason: we'd found it worked well at the place we'd all worked at previously...)

        The idea of everybody having the same paths for things never sat right with me, because it's an easy way for absolute paths to sneak in, which can be (and occasionally was) a problem when trying to support multiple branches later in the project lifecycle. But if you've got a CI system, which people typically do in 2024, that's much less of an issue (because you can arrange for the CI system to build from a random path each time). And it is pretty handy when you can paste universally-usable paths into the chat.

        • Kwpolska 15 hours ago

          > you can arrange for the CI system to build from a random path each time

          Or you can end up having to use the magic path for all builds.

          At work, we have a subst’d drive. The way we set it up means that some <evil crap> requires it (by virtue of hardcoding, or of common configuration). But I do most of my development from a fully separate, leaner copy of the repo, and consider the subst drive to be a garbage pile.

          • tom_ 14 hours ago

            The idea is that you set up the CI system to build from a different path as often as possible. Per build, per build machine, per build machine rebuild - whatever works for you. Just ensure it's a different path from the standard one that the developers use. Ideally, also make sure that the path the developers use is completely inaccessible on the CI machines.

            Now the chance of introducing dependencies on the standard path is much lower. You can do it, and it will work for you, and it will work for everybody else too when they get your change. But at some point, the CI system will get it, and it will fail, and you'll know that you have something to fix.

  • codesnik 15 hours ago

    poor man's chroot.

  • johng 15 hours ago

    Ahh yes, subst was very handy many times back in the day and it worked like magic to me!

pram 15 hours ago

Is INT 2Fh the DOS equivalent of PATH? What a bizarre mechanism, I've read it twice and I have no idea what it's saying lol:

http://vitaly_filatov.tripod.com/ng/asm/asm_011.16.html

  • SeenNotHeard 14 hours ago

    INT 2Fh was the so-called "mux" that various TSRs and drivers could hook into for (in essence) interprocess communication. The half-baked idea was to solve the problem of TSRs commandeering other interrupts for one-off needs, which led to lots of collisions.

    In order for the mux to work, each TSR had to have its own identifier code. Other than some reserved ranges, no one organized such a namespace, meaning it was possible for two or more TSRs to intercept the same request, leading to the same collision problem.

    This note from Ralf Brown's Interrupt List has the gory details:

    http://www.ctyme.com/intr/rb-4251.htm

    Incomplete list of TSRs and drivers relying on it:

    http://www.ctyme.com/intr/int-2f.htm
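
    To make the convention concrete, here's roughly what a mux "installation check" looked like from C (a sketch assuming a Turbo C-style 16-bit DOS compiler; per Ralf Brown's list, AX=B700h is APPEND's check, and a resident APPEND answers with AL=FFh):

        #include <dos.h>
        #include <stdio.h>

        int main(void)
        {
            union REGS r;

            r.x.ax = 0xB700;      /* AH=B7h: APPEND's mux ID, AL=00h: install check */
            int86(0x2F, &r, &r);  /* raise the multiplex interrupt */
            printf(r.h.al == 0xFF ? "APPEND is resident\n"
                                  : "APPEND is not installed\n");
            return 0;
        }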

  • epcoa 15 hours ago

    It's just an ugly-ass syscall extension mechanism (so it has no direct equivalent in Linux, let's say); it definitely looks bizarre in modern times.

    INT 2F is initially handled by DOS as a stub, but additional programs (like drivers and TSRs) can override INT 2F, add their bucket of functionality, and then fall back to whatever the previously installed handler was (called chaining) for whatever they don't handle.

    This gives a glimpse into how much various crap could end up installed as an Int 2F handler: https://www.minuszerodegrees.net/websitecopies/Linux.old/doc...

    It was often used for feature/presence checks and usually nothing time-critical, as that chaining setup was most definitely not timing-friendly.
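
    The chaining part looked something like this (a heavily simplified Borland-style sketch, not a complete TSR: the mux ID 0xC7 is hypothetical, and real code also had to pick a free ID and cope with reentrancy):

        #include <dos.h>

        #define MY_ID 0xC7                  /* hypothetical multiplex ID */

        static void interrupt (*old2f)(void);

        static void interrupt my2f(unsigned bp, unsigned di, unsigned si,
                                   unsigned ds, unsigned es, unsigned dx,
                                   unsigned cx, unsigned bx, unsigned ax)
        {
            if ((ax >> 8) == MY_ID)
                ax = (ax & 0xFF00) | 0xFF;  /* ours: answer the install check */
            else
                _chain_intr(old2f);         /* not ours: pass along the chain */
        }

        int main(void)
        {
            old2f = getvect(0x2F);          /* remember the previous handler */
            setvect(0x2F, my2f);            /* hook ourselves in front of it */
            keep(0, 256);                   /* terminate and stay resident
                                               (resident size simplified) */
            return 0;
        }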

    • skissane 11 hours ago

      > This gives a glimpse into how much various crap could end up installed as an Int 2F handler:

      Lots of crap in INT 21 too: https://fd.lod.bz/rbil/zint/index_21.html (on the whole I like this HTMLified version of Ralf Brown's Interrupt List better)

      But I suppose they invented INT 2F to discourage people from doing that to INT 21.

      And then Ralf Brown also proposed an alternative multiplex interrupt, INT 2D: https://fd.lod.bz/rbil/interrup/tsr/2d.html#4258

      • astrobe_ 3 hours ago

        INT 21 was for DOS what INT 80 was for Linux: the gateway to the OS kernel.

        Overriding software interrupt handlers was a common mechanism, applied to BIOS interrupts as well (e.g. there was a BIOS timer interrupt that was there just for that).

        The idea was that programs would take over the interrupt vector then chain-call the previous handlers. One can still see something similar at play in certain scenarios with e.g. callbacks. The system/library doesn't have to maintain a list of clients.
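
        The same pattern survives in plain C signal handlers, for instance: installing a handler returns the previous one, and polite code chains to it (a minimal POSIX sketch):

            #include <signal.h>
            #include <unistd.h>

            static void (*old_handler)(int);

            static void my_handler(int sig)
            {
                write(1, "my handler ran\n", 15);  /* async-signal-safe output */
                if (old_handler != SIG_DFL && old_handler != SIG_IGN)
                    old_handler(sig);              /* chain to previous handler */
            }

            int main(void)
            {
                old_handler = signal(SIGINT, my_handler);  /* returns old one */
                raise(SIGINT);                             /* fire once to demo */
                return 0;
            }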

miohtama 15 hours ago

I remember wondering about APPEND as a kid three decades ago. Looks like it had a very specific legacy use case, which was no longer present in more modern DOS versions. Live and learn.

NotYourLawyer 10 hours ago

Huh. I knew this command existed but never looked into it. I assumed it was like cat, just appending files together.

pavlov 16 hours ago

> “In fact it is known that DOS 2.0 could not be built on PCs at all, and was built on DEC mainframes.”

Nitpick, but DEC never made a mainframe. Their products like the PDP-11 were considered minicomputers (even though the CPU was the size of a fridge) to distinguish them from IBM’s mainframes and medium sized computers.

  • retrac 16 hours ago

    DEC was always finicky about naming; the PDP series originally wasn't supposed to be called a computer, because computers were thought of as much bigger than the products DEC sold, and customers in the 50s and 60s might be put off by a name they associated with multi-million dollar expenses.

    But the PDP-10 and VAX 9000 were basically mainframes. Million dollars or more. Whole large room with three phase power. Standard building AC might suffice but that was pushing the margin. And the faster clocked VAX 9000 was water cooled! That's not a minicomputer.

    • brucehoult 15 hours ago

      My PC is water cooled.

      However the VAX 9000 wasn't. It was initially designed to be water cooled but during development they improved the air cooling enough that water cooling was never shipped.

      One VAX 9000 CPU did around 70k Dhrystones/sec, so 40 VAX MIPS (DMIPS as we call them now).

      A modern simple RISC 5-stage in-order CPU (e.g. Berkeley RISC-V Rocket) with decent branch prediction does 1.6 DMIPS/MHz, so will need 25 MHz to match the VAX 9000. However the December 2016 HiFive1 ran at 320 MHz, so would be around a dozen times faster than the VAX 9000 -- but without FPU or MMU. Today, a $5 Milk-V Duo running at 1 GHz (64 bit, and with MMU and FPU and vector unit) will be 40x faster than a VAX 9000.

      The VAX 9000 has a vector unit, capable of a peak 125 MFLOPS. The Duo's C906 VFMACC.VV instruction has 4 cycles latency, for two 64-bit multiply-adds or four 32-bit multiply-adds, so at 1 GHz that's 1 GFLOP double precision or 2 GFLOP single precision, 8x a VAX 9000.

      Don't even ask what a modern i9 or Ryzen can do!

    • mmooss 16 hours ago

      > But the PDP-10 and VAX 9000 were basically mainframes. Million dollars or more. Whole large room with three phase power. Standard building AC might suffice but that was pushing the margin. And the faster clocked VAX 9000 was water cooled! That's not a minicomputer.

      Why is that not a minicomputer? From our perspective it's a massive installation; from the perspective of the time, it was not necessarily.

      • varjag 15 hours ago

        A minicomputer at the time was considered a system that fits in a rack or three, not something that requires a purpose-built room with its own mains, raised floor, and AC.

      • somat 7 hours ago

        I am not exactly sure what makes a modern mainframe (the same architecture as a historical mainframe, I guess), but for historical machines I consider the wide machines (like the 36-bit PDP-10) mainframes, whereas minicomputers were usually narrower, 16 or 18 bit machines.

        Really there is no good formal definition; dividing computers into three groups (mainframe, minicomputer, microcomputer) based on how much you paid for them is as good a mechanism for figuring this out as any other.

  • Hilift 15 hours ago

    It was probably a VAX 11/780. If you were cheap you purchased an 11/750. The 780 had a PDP-8 for a console processor. https://news.microsoft.com/features/the-engineers-engineer-c...

    • SulphurCrested 12 hours ago

      The console processor was an LSI-11, a PDP-11 on a chip. It hung off the inside of one of the cabinet doors. It was responsible for booting the 780 and also gave VMS access to its own 8″ floppy drive, which had a habit of overheating.

      The 750 was later technology, IIRC MOSFET. It also lacked the 780’s “compatibility mode”, which allowed VMS to run 16 bit RSX-11 executables by implementing the PDP-11 instruction set in addition to the 32 bit VAX one, and could boot without a console processor. If you didn’t need all the users on one machine, the 750 was cheaper per user.

  • p_l 16 hours ago

    The PDP-11 was a minicomputer, but PDP-10s were "minis" only formally; the VAX, due to its reasonable size and comparable performance, coined the title "supermini", IIRC.

  • surgical_fire 16 hours ago

    Nitpick, but as far as I remember, minicomputers are midrange computers. DEC PDP-11 would be in the same class as IBM AS/400, so while they were distinguished from mainframes, they were "medium sized computers".

    Assuming, of course, that those "medium sized computers" are the midrange.

    • jasomill 8 hours ago

      This gets even more confusing when you consider that IBM released a number of mainframes smaller than many of their midrange systems, including both System/370 and System/390 systems implemented as PC expansion cards (ISA, MCA, and PCI).

      Ultimately, within IBM at least, "mainframe" ended up just referring to any computer that implemented the System/360 architecture or its descendants.

      Outside IBM, I've seen the term applied to just about any multiuser system accessed primarily through a terminal interface, including x86 servers running Linux, though typically by nontechnical users.