LatencyKills 3 days ago

Ex-Apple engineer here. This is, for better or worse, just the way Apple approaches this type of problem. From Apple's perspective, this is the way to preserve Finder / Gatekeeper / metadata semantics. It avoids silent data loss when round-tripping archives between Macs. This behavior also maintains consistency with copyfile(3) (as well as the Archive Utility behavior).

Apple treats tar less like “portable Unix interchange” and more like “archive this filesystem object faithfully.” That is very Apple, and very libarchive. ;-)

This is probably going to get worse (as Apple continues to add macOS-specific metadata), so your workaround is very helpful.

I haven't tested it in a while, but at one point, setting the COPYFILE_DISABLE=1 env variable would disable the inclusion of macOS-specific metadata.
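
A minimal sketch of how one might check this on a given machine, assuming a POSIX shell. The `demo` directory and archive name are hypothetical, and the check only looks for AppleDouble `._` members; xattrs stored as pax extended headers are a separate matter that it does not cover.

```shell
# Sketch: build an archive with COPYFILE_DISABLE=1, then look for AppleDouble entries.
# All paths below are throwaway, hypothetical names.
workdir=$(mktemp -d)
mkdir -p "$workdir/demo"
printf 'hello\n' > "$workdir/demo/file.txt"

# tar builds that do not know the variable simply ignore it.
COPYFILE_DISABLE=1 tar -czf "$workdir/demo.tar.gz" -C "$workdir" demo

# AppleDouble sidecars appear as members whose basename starts with "._".
# Assumption: the local tar lists such members as ordinary entries.
# grep -c exits non-zero when the count is 0, hence the "|| true".
appledouble_count=$(tar -tzf "$workdir/demo.tar.gz" | grep -c -E '(^|/)\._' || true)
echo "AppleDouble entries: $appledouble_count"
```

Running the same listing on the Linux side of a deploy is probably the more telling check, since that is where the extra members cause trouble.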

  • Terretta 1 day ago

    Arguably, principle of least surprise is very Apple.

    If I point "tape archive" at a file system, I want that file system archived to tape. And so, tar does.

    If I don't, well, that's a fine option, and there's a fine option for that.

    So it's less of a "workaround" or something that "gets worse", than, "No, I don't really want a tape archive of this filesystem, only of some of it." And that's supported.

    That said, never seeing another .DS_Store should be a system-wide option!

    • taftster 8 hours ago

      > That said, never seeing another .DS_Store should be a system-wide option!

      Yes please.

      • ryandrake 7 hours ago

        .DS_Store, .fseventsd, .Spotlight-V100, .Trashes, and ._this and ._that

        These can all die in a fire too, as far as I am concerned. macOS loves to treat the user's filesystem as its own personal garbage dump.

        • gerdesj 7 hours ago

          thumbs.db and those weird MS alternative stream files for recording origination.

          filesystem attributes are for decorating files with meaning. Anything else that attempts to use filesystems in "interesting" ways is silly.

          Apple and MS really ought to consider why they do this sort of fragile, idiosyncratic nonsense.

          • Joker_vD 7 hours ago

            But... thumbs.db is precisely not an "attempt to use filesystems in "interesting" ways"; it's literally just a hidden file with previews stored in it. Storing the preview in the alternative stream of the file with the picture itself would be "an interesting way".

            • kstrauser 6 hours ago

              Agreed. Where else would you put that stuff? It’s gotta go somewhere, and this is the least surprising place IMO. Anywhere else would have to be a parallel store that follows filesystem mounts and unmounts, renaming directories, etc. so that it always perfectly mirrors the thing it’s configuring.

            • mook 3 hours ago

              In the particular case of thumbs.db, storing them in NTFS alternate data streams would have been a good idea; they're essentially caches for the main data stream, so if they fail to copy to different filesystems it's totally fine. Of course, that wasn't viable because 1) IIRC that was before the widespread adoption of NTFS, and 2) they probably still need the cache somewhere for vFAT USB drives.

        • emmelaich 6 hours ago

          OTOH, if you want the information contained in those files, where else would you save it?

          • ajxs 5 hours ago

            To me it seems more sensible to store information relevant only to this OS in a specific cache somewhere within that OS. It would even make cache-like functionality such as evicting old entries super easy.

            • Gigachad 1 hour ago

              There are some tradeoffs. Like if you used a usb and set up folder colours or any of the other things stored in the file, they would not move along with the usb when used on another computer.

    • JoshTriplett 7 hours ago

      > Arguably, principle of least surprise is very Apple.

      Principle of least surprise is good engineering practice. The question is always whose surprise. Someone who expects tar to behave like other UNIX systems is going to be surprised by this. Someone who expects tar on Apple to have perfect fidelity would be surprised by not-this.

      I increasingly feel like build systems should never be relying on any "native" utilities from the host system, and should instead be bringing them in via dependencies. You can't have this problem if your packaging system pulls in a specific portable `tar` library.

      • amarant 6 hours ago

        Nixos has a pretty solid solution to this issue: key your dependencies with checksums of the content. That way you get the best of both worlds: you always get the exact version you want, and you can share a copy of that exact version with other software that wants to use that exact version too!

        • JoshTriplett 6 hours ago

          Yeah, Nix-like distributions (e.g. guix, lix) do for Linux systems what some language package managers (e.g. cargo) do for individual projects.

        • altairprime 4 hours ago

          Are the xattr / chattr / umask checksums rolled into the main data fork content or are they hashed separately (or not at all)?

          • a_t48 2 hours ago

            IIRC Nix is checksummed in the hash of the source of the content, not the results.

            • microtonal 1 hour ago

              Hash of a normalization of the derivation, so this roughly means source, dependencies and the ‘build recipe’. The exceptions are fixed-output derivations, which are typically content-hashed.

              That said, a lot of work is done in content-addressed hashing, but AFAIK it’s not the default yet.

        • dented42 3 hours ago

          So it sounds like you don’t get the exact version you want because metadata is thrown away.

      • Joker_vD 6 hours ago

        > I increasingly feel like build systems should never be relying on any "native" utilities from the host system, and should instead be bringing them in via dependencies.

        Well, you see, while this, frankly, applies not just to build systems but to most software, the consensus in the community of distro-maintainers is that it's actually wrong: you should use your system's package manager, and tools it can install, and let it fiddle with the ambient environment and give you that delicious "path dependency". And if your distro's packaging environment doesn't allow you to do the things you need (e.g. being able to install both mongodb 3.8 and mongodb 5.0, ideally at the same time, but okay, I can keep running apt remove/install over and over, but I do need to check if my app correctly handled the wire protocol changes), well, that's your problem for desiring strange things.

      • crazygringo 4 hours ago

        > The question is always whose surprise.

        I think that the surprise of more data than expected is more desirable than the surprise of data loss. So in this case, it seems like the safe choice.

        • dlenski 2 hours ago

          Agreed. I usually hate on Apple, and its terribly ancient utilities and gratuitous incompatibility with modern Linux utilities, motivated by hatred of the GPL license.

          But in this case, I think what it's doing is… basically fine? "Tar should faithfully reproduce the semantics of the source filesystem" is a perfectly reasonable starting point.

          Ideally there would be a documented way to turn off the Apple-specific metadata with Apple's own tar, though.

          • saagarjha 1 hour ago

            From tar(1):

                 --no-mac-metadata
                         (x mode only) Mac OS X specific.  Do not archive or extract ACLs
                         and extended file attributes using copyfile(3) in AppleDouble
                         format.  This is the reverse of --mac-metadata.  and the default
                         behavior if tar is run as non-root in x mode.
      • Arainach 2 hours ago

        Apple is always surprised that non-Apple devices exist.

        See: the permanent undismissable red icon to "finish setting up your Apple TV with your iPhone"

        • simianparrot 50 minutes ago

          Apple can't control non-Apple devices. They can only control their own. So this makes perfect sense.

      • Someone 29 minutes ago

        > Someone who expects tar to behave like other UNIX systems is going to be surprised by this

        They shouldn’t be. The GNU tar manual already shows this behavior. https://www.gnu.org/software/tar/manual/html_node/What-tar-D...:

        “Because the archive created by tar is capable of preserving file information and directory structure, tar is commonly used for performing full and incremental backups of disks”

        And yes, that same page also says:

        “You can create an archive on one system, transfer it to another system, and extract the contents there. This allows you to transport a group of files from one system to another.”

        > You can't have this problem if your packaging system pulls in a specific portable `tar` library.

        You can’t pull in specific portable stuff all the way down (not even when running in Docker or a VM), so that will decrease the risk, but it cannot completely remove it. As an example, I think GNU tar will happily include .DS_Store files in archives.

    • saghm 2 hours ago

      If you think that most people who run the tar command are assuming it will work like a tape archive, you'll probably be the one surprised

  • matheusmoreira 7 hours ago

    It's a good attitude to have, in my opinion. Portability is overrated. Linux developers should be doing a lot more of this. We should be making everything work better for us without caring how it's going to impact other irrelevant platforms. Let the people who actually care about those platforms worry about such things.

    • cozzyd 7 hours ago

      It would at least be nice if there was a way to keep apple users from shitting all over the filesystem with remote mounts and ds_store files. Perhaps by automatically unmounting if one is detected.

      • bombcar 6 hours ago

        At least if you're using ZFS as the backing store and Samba, you can set vfs objects = catia fruit streams_xattr and similar config options to use extended attributes.

      • seqastian 1 hour ago

        defaults write com.apple.desktopservices DSDontWriteNetworkStores true

    • messe 4 hours ago

      > Linux developers should be doing a lot more of this. We should be making everything work better for us without caring how it's going to impact other irrelevant platforms

      Linux developers already do. Using a BSD can already be a pain in the arse, thanks to (often poorly thought out) Linux-isms cropping up everywhere.

    • Gigachad 1 hour ago

      Portability of tar archives at least. We should have some formats like .zip which are standardised, and allow others like tar to be faithful replicas of exactly how the OS stores data.

      • gjadi 39 minutes ago

        Except that zip does not preserve permissions.

        • Gigachad 29 minutes ago

          That seems fine to me. I’ve never cared about permissions in a zip. Zip these days is primarily for exchanging a directory as a single file to another person. Permissions wouldn’t work across computers anyway.

          If you want a faithful archive of the data then a tar archive or disk image is what you want.

  • jmclnx 7 hours ago

    To me, the big question is why Apple needs all these file attributes? If the files are extracted OK, just ignore the errors :)

    • bombcar 6 hours ago

      Apple has had multiple streams per file since the very beginning, and it can store useful and necessary information (the latter is quite rare now, as most things have sane defaults, but losing the extended attributes can lose things that can be annoying).

  • hamasho 1 hour ago

    Funnily enough, I got the error message and asked Claude Code, and it replied;

        The warning can be suppressed by `--no-xattrs --no-mac-metadata`.
    

    then just edited the code as

        -  tar czf dist.tar.gz dist
        +  COPYFILE_DISABLE=1 tar czf dist.tar.gz dist
pier25 8 hours ago

I use these settings when creating a tar file for deploy:

    tar --no-xattrs --no-mac-metadata -czf
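
For a deploy script that runs on both macOS and Linux, a hedged sketch of a wrapper that only passes these flags when the local `tar` identifies itself as bsdtar. The function name `make_deploy_tar` and the `dist` layout are hypothetical, and it assumes the local bsdtar build accepts both options.

```shell
# Hypothetical helper: the name and layout are illustrative, not from any tool.
make_deploy_tar() {
  out=$1
  shift
  if tar --version 2>/dev/null | grep -qi 'bsdtar'; then
    # bsdtar (the macOS default): use the flags quoted above.
    # Assumption: this bsdtar build accepts both options.
    tar --no-xattrs --no-mac-metadata -czf "$out" "$@"
  else
    # GNU tar does not store xattrs unless given --xattrs, so no extra flags.
    tar -czf "$out" "$@"
  fi
}

# Usage with a throwaway directory standing in for a real build output.
demo_dir=$(mktemp -d)
mkdir -p "$demo_dir/dist"
printf 'ok\n' > "$demo_dir/dist/index.html"
(cd "$demo_dir" && make_deploy_tar dist.tar.gz dist)
tar -tzf "$demo_dir/dist.tar.gz"
```

Branching on `tar --version` avoids handing `--no-mac-metadata` to a tar that may reject it.
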
  • jherskovic 6 hours ago

    I do this same thing too when building archives on macOS that I will unpack on Linux later.

red_admiral 24 minutes ago

Why switch to a completely different tar and rewire the PATH when you could just set a shell alias? You'll need to edit .bashrc both times but there's no need to install a second tar to /opt to solve this.

throw0101a 6 hours ago

Per this 2018 page, GNU tar seems to work with SCHILY.* encoded xattrs, but not LIBARCHIVE.* ones:

* https://mgorny.pl/articles/portability-of-tar-features.html#...

* Via: https://github.com/mxmlnkn/ratarmount/issues/145

bsdtar ≥3.7.2 apparently adds both types to its files for maximum portability:

* https://github.com/libarchive/libarchive/pull/691/files#diff...

AFAICT, bsdtar will default to "ustar" format, but will auto-switch to "pax" if needed.

  • Pay08 2 hours ago

    I wonder how come GNU tar never added them. I have to assume someone has brought the problem to their attention before.

albertzeyer 32 minutes ago

But these are not errors. These are just warnings you can ignore? It's not really so critical?

seba_dos1 2 hours ago

> Why does it have those extra files?

> For some reason

Very informative!

raffraffraff 2 hours ago

Would this ever affect me if I don't use many of MacOS's built-in tools? I brew install GNU equivalents and make them all default. Just like how I also don't use most of their desktop environment stuff, and instead use rectangle, hammerspoon, karabiner to make it feel more like the Linux desktop I wish I could use at work.

chmaynard 7 hours ago

Homebrew installs GNU tar as "gtar". On my M4 MacBook:

  $ which gtar
  gtar is /opt/homebrew/bin/gtar
  • fastily 6 hours ago

    I've installed the gtar formula and aliased it to tar. Can't be bothered to memorize the differences between macOS tar and unix tar, especially when the latter is considered to be the de facto standard.

angry_octet 7 hours ago

We might also ask, why doesn't Linux also track such meta-data? Are Linux users not also subject to drive-by downloads impersonating valid files? Should we be one chmod a+x away from compromise?

  • danielheath 7 hours ago

    Yes, we should be.

    My computer should run programs when I tell it to run them.

    Don’t blunt _every_ tool just to make them harder to cut yourself on.

    • angry_octet 7 hours ago

      I hope you're in the very small minority of people who rigorously manage untrusted downloads and whitelist every binary, because you're operating an appliance from the 1970s, sticking a metal fork into an un-earthed toaster. Most people need help from their operating system.

      • b65e8bee43c2ed0 3 hours ago

        then we, the very small minority, want a button to disable that help.

    • Joker_vD 7 hours ago

      I sincerely agree. By the way, thanks for lending your machine for my "Network-Retransmission-and-Compute-as-a-service" network.

    • rtpg 5 hours ago

      Increased metadata isn't tool blunting in itself though, even if MacOS uses it for being... annoying is one way of saying it.

      Provenance information bundled into a file is not the worst idea in the world IMO. We have created/modified timestamps on files already, right? There's definitely the question of "why" but hey if more of my binaries just had at least a tag about who put them there that would be a win in my book.

      Not an argument for doing what MacOS does, just an argument that the info would be nice to have.

    • danishanish 4 hours ago

      It’s not blunting a tool, it’s sheathing it. Modern software requires too much proxied trust for this attitude to work.

  • bitfilped 7 hours ago

    Should I be able to run files I download on my own computer? I think yes I should, hate fighting MacOS to do simple tasks because Apple engineers assume the end user has the average intelligence of an ostrich.

    • shawn_w 3 hours ago

      That might be an overly optimistic assumption for the typical user, to be fair.

  • emmelaich 6 hours ago

    Tar on linux will. e.g. selinux attrs and other xattrs.

    Open question: is it worth attempting to maintain these semantics between mac and linux?

    • worthless-trash 1 hour ago

      No,

      I just assume apple will break the behavior when they want to.

bombcar 6 hours ago

I'll admit that if I don't care about extended attributes (I never really do) I just use zip instead.

  • chungy 1 hour ago

    I have bad news for you: Zip supports storing extended attributes as well.

firesteelrain 8 hours ago

You can either send stderr to /dev/null or use --warning=no-unknown-keyword to suppress them cleanly.

But still interesting nonetheless why they are added
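
A rough sketch of the second option on the extracting side, assuming a POSIX shell. The `extract_quietly` name and the scratch paths are hypothetical. Unlike redirecting stderr to /dev/null, this keeps other diagnostics visible, and it only passes the flag when the local tar reports itself as GNU tar.

```shell
# Hypothetical helper: extract an archive, silencing only the
# "unknown extended header keyword" warnings where that is supported.
extract_quietly() {
  archive=$1
  dest=$2
  mkdir -p "$dest"
  if tar --version 2>/dev/null | grep -q 'GNU tar'; then
    tar --warning=no-unknown-keyword -xzf "$archive" -C "$dest"
  else
    # Non-GNU tars: fall back to a plain extract.
    tar -xzf "$archive" -C "$dest"
  fi
}

# Usage with a throwaway archive.
scratch=$(mktemp -d)
mkdir -p "$scratch/src"
printf 'payload\n' > "$scratch/src/a.txt"
tar -czf "$scratch/a.tar.gz" -C "$scratch/src" a.txt
extract_quietly "$scratch/a.tar.gz" "$scratch/out"
cat "$scratch/out/a.txt"
```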

anthk 1 hour ago

How well does pax handle this?