bigshik an hour ago

Nice work—this hits a real pain point with Parquet. My main use case is debugging partitioned datasets on S3 with schema drift and skew, where I care about: which files/partitions have schema mismatches, weird row-group stats (all-null, out-of-range, huge skew), and doing that via metadata only.

Right now parqeye looks mainly single-file focused. Do you have plans for a “dataset mode” that takes a dir/S3 prefix and surfaces per-file/row-group summaries (row counts, min/max, null %, schema diffs vs a reference file) using just Parquet stats so it scales to tens of GB? Or do you see parqeye intentionally staying a single-file inspector?
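
To make the ask concrete, here is a rough sketch of the metadata-only summary I have in mind. It is not how parqeye works; it assumes Python with pyarrow installed, and the helper names (`schema_diff`, `null_pct`, `summarize_file`) are made up for illustration. It only handles local paths, so listing an S3 prefix is left out.

```python
from typing import Dict, Optional


def schema_diff(reference: Dict[str, str], other: Dict[str, str]) -> Dict[str, object]:
    """Compare two {column name: type string} maps (hypothetical helper)."""
    shared = sorted(set(reference) & set(other))
    return {
        "missing": sorted(set(reference) - set(other)),  # in reference only
        "extra": sorted(set(other) - set(reference)),    # in other only
        "type_changed": {
            name: (reference[name], other[name])
            for name in shared
            if reference[name] != other[name]
        },
    }


def null_pct(num_rows: int, null_count: Optional[int]) -> Optional[float]:
    """Null percentage for one column chunk, or None if it cannot be known."""
    if null_count is None or num_rows == 0:
        return None
    return 100.0 * null_count / num_rows


def summarize_file(path: str) -> dict:
    """Summarize one Parquet file from its footer metadata, without reading data pages."""
    import pyarrow.parquet as pq  # imported lazily; only needed for real files

    pf = pq.ParquetFile(path)
    md = pf.metadata
    # Top-level fields only; nested columns show up under dotted paths below.
    schema = {field.name: str(field.type) for field in pf.schema_arrow}

    row_groups = []
    for i in range(md.num_row_groups):
        rg = md.row_group(i)
        cols = {}
        for j in range(rg.num_columns):
            col = rg.column(j)
            st = col.statistics  # None when the writer stored no statistics
            has_min_max = st is not None and st.has_min_max
            nulls = st.null_count if st is not None and st.has_null_count else None
            cols[col.path_in_schema] = {
                "min": st.min if has_min_max else None,
                "max": st.max if has_min_max else None,
                "null_pct": null_pct(rg.num_rows, nulls),
            }
        row_groups.append({"num_rows": rg.num_rows, "columns": cols})

    return {"path": path, "schema": schema, "row_groups": row_groups}
```

A dataset mode would then amount to running `summarize_file` over every file under the prefix, calling `schema_diff` against the reference file's schema, and flagging row groups where `null_pct` is 100 or the row counts are badly skewed.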

papers1010 4 hours ago

It’s crazy how long we’ve gone without a tool like this. This is huge. Thank you for finally building this!

  • 0cf8612b2e1e 2 hours ago

    It is really incredible how poor the parquet tooling has been for years. The cornerstone of data engineering, yet just inspecting a file is needlessly clunky.

lolive 4 hours ago

Can DuckDB be included in the tool, so you can run queries directly from the UI? [that would avoid opening DBeaver whenever you need that kind of feature]

jspanos2 an hour ago

This is very impressive. Looking forward to using this.

dionian 22 minutes ago

tried it out. love it.

swety101 an hour ago

Such a cool idea!! So helpful

banga 3 hours ago

Looks like a nice tool, but it failed for me when reading a GeoParquet file created using DuckDB.

lolive 4 hours ago

Apart from some visual glitches, this is an INSTANT BUY!

Note: must the Windows binary really be 78 MB?

  • ch2026 2 hours ago

    CLIs are bulky

WorldPeas 6 hours ago

thank you so much! this was an annoyance of mine for so long. edit: any chance you make a brew package? if you'd like I'd be happy to PR it in.

  • kaushiksrini 5 hours ago

    yep! it’s available as a homebrew tap — you can install it with: `brew install kaushiksrini/parqeye/parqeye`

    • dacox 29 minutes ago

      awesome! i was just looking at a bucket full of parquet files from last year trying to recall some things about them.

      i tried to install with brew, but it told me my cli tools were "too out of date". Never seen that before! and also just upgraded.

      Will try again tomorrow