Metot – Using LLMs for structural argument mapping (not just summarization)

2 points by hkcanan a day ago

Hi HN,

I'm actually a lawyer by trade, not a full-time developer.

I built Metot because generic AI summaries are often useless for work that demands academic rigour: they gloss over the logic. As a lawyer, I needed to see the skeleton of an argument, not a blurb.

So, I spent the last few months iterating on system prompts and structured output constraints to force LLMs to act as analytical engines rather than creative writers.

The "AI" difference: Instead of asking the model to "write about this," Metot uses a multi-stage prompting pipeline to:

Deconstruct Logic: It extracts the Thesis -> Premises -> Evidence chain (Argument Mapping).

Analyze Metadata: It identifies methodological contributions and citation networks (Lit Review).

Review Tone: It acts as a strict academic referee for style and consistency (Text Review).
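
For illustration, here's a minimal sketch of what such a pipeline can look like, assuming an OpenAI-style chat API with JSON-constrained output. The prompts, stage names, and model are placeholders, not Metot's actual implementation:

    import json
    from openai import OpenAI

    client = OpenAI()

    # One analytical system prompt per stage (illustrative placeholders).
    STAGES = {
        "argument_map": "Extract the thesis -> premises -> evidence chain. "
                        "Return JSON with keys 'thesis', 'premises', 'evidence'. "
                        "Do not summarize.",
        "lit_review": "Identify methodological contributions and cited works. "
                      "Return JSON with keys 'contributions', 'citations'.",
        "text_review": "Act as a strict academic referee. Return JSON with keys "
                       "'style_issues' and 'consistency_issues'.",
    }

    def run_stage(stage: str, paper_text: str) -> dict:
        """Run one analysis stage with a JSON-constrained response."""
        resp = client.chat.completions.create(
            model="gpt-4o",
            messages=[
                {"role": "system", "content": STAGES[stage]},
                {"role": "user", "content": paper_text},
            ],
            response_format={"type": "json_object"},  # structured output
            temperature=0,  # analytical engine, not creative writer
        )
        return json.loads(resp.choices[0].message.content)

    def analyze(paper_text: str) -> dict:
        """Run all stages and collect their structured results."""
        return {stage: run_stage(stage, paper_text) for stage in STAGES}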

The Output (Screenshots): I tested it on Judith Thomson’s The Trolley Problem to show it works outside of law. You can see how the tool extracted the specific "Distributive Exemption" argument structure and the "Loop Case" analysis in these screenshots: https://imgur.com/a/1EMysz1

Why I’m posting here: Since my background is strictly legal, I need feedback from researchers in STEM and Social Sciences:

Does this "Argument Mapping" structure hold up for your field's papers?

Where do the LLM hallucinations creep in for your specific domain?

Privacy Note: No user data is used for model training.

Invite Code: Use HACKERNEWS to skip the waitlist.

Link: https://metot.org

MrCoffee7 a day ago

It looks like metot.org uses a vote-counting approach: it tallies how many PRO and CON arguments were made and then uses that count to evaluate the strength of the argument. There are a lot of problems with this approach. Vote counting fails because of:

• Epistemic naivety: ignores evidence quality

• Structural blindness: misses dialectical interactions

• Semantic impoverishment: no context, no warrants, no hedging

• Temporal insensitivity: static snapshots of dynamic discourse

• Fallacy tolerance: no rhetorical/logical error detection

A proper system requires:

• Deep NLP: discourse parsing, semantic role labeling, entailment

• Structured reasoning: abstract argumentation frameworks (AAF), probabilistic argumentation, Bayesian aggregation

• Domain knowledge: evidence hierarchies, causal inference, statistical meta-analysis

• Explainability: attention visualization, counterfactual reasoning, gradient-based saliency

  • hkcanan a day ago

    Thanks for the feedback, but characterizing this system as “vote counting” is incorrect. Metot’s argument analysis uses a fundamentally different methodology.

    What We Actually Use:

    1. Toulmin Model Analysis

    Each argument is analyzed for its full structure, not just PRO/CON:

    • Claim: The specific assertion

    • Evidence: Supporting facts, data, sources

    • Warrant: The reasoning connecting evidence to claim

    • Strength Score: 1-10 based on evidence quality, warrant clarity, and fallacy presence
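
    As a concrete sketch, this structure can be enforced with a Pydantic schema; the field names below are illustrative, not our exact production schema (it also carries the fallacy labels from point 3):

      from enum import Enum
      from pydantic import BaseModel, Field

      class Fallacy(str, Enum):
          CIRCULAR_REASONING = "circular_reasoning"
          AD_HOMINEM = "ad_hominem"
          STRAW_MAN = "straw_man"
          FALSE_DICHOTOMY = "false_dichotomy"
          HASTY_GENERALIZATION = "hasty_generalization"

      class ToulminArgument(BaseModel):
          claim: str                     # the specific assertion
          evidence: list[str]            # supporting facts, data, sources
          warrant: str                   # reasoning connecting evidence to claim
          fallacies: list[Fallacy] = []  # detected logical errors, if any
          strength: int = Field(ge=1, le=10)  # scored on evidence quality,
                                              # warrant clarity, fallacy presence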

    2. Dialectical Mapping with Recursive Response Structure

    Contrary to “structural blindness,” our system tracks how arguments respond to each other recursively:

    Argument 1 (Supporting)
    └── Response 1.1 (Opposing - objection)
        └── Response 1.1.1 (Supporting - rebuttal)
            └── Response 1.1.1.1 (Opposing - counter-rebuttal)

    This captures unlimited depth of dialectical exchanges.
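
    In code terms this is just a recursive tree; a minimal sketch (naming is illustrative, not our internal model):

      from dataclasses import dataclass, field
      from typing import Literal

      @dataclass
      class ArgumentNode:
          text: str
          stance: Literal["supporting", "opposing"]  # relative to the thesis
          responses: list["ArgumentNode"] = field(default_factory=list)

          def depth(self) -> int:
              """Depth of the dialectical exchange rooted at this node."""
              return 1 + max((r.depth() for r in self.responses), default=0)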

    3. Logical Fallacy Detection

    Contrary to “fallacy tolerance,” we detect: circular reasoning, ad hominem, straw man, false dichotomy, hasty generalization, and others.

    4. Context-Aware Type Assignment

    Argument type (supporting/opposing) is determined relative to the author’s thesis, not in absolute terms. If the author criticizes Theory X, arguments against X are classified as “supporting.” This addresses semantic context.
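
    The rule is easiest to see in code (hypothetical helper, for illustration only):

      def assign_type(arg_favors_theory: bool, thesis_favors_theory: bool) -> str:
          """Classify an argument relative to the author's thesis, not absolutely."""
          agree = arg_favors_theory == thesis_favors_theory
          return "supporting" if agree else "opposing"

      # The author criticizes Theory X, so an argument against X supports the thesis:
      assert assign_type(arg_favors_theory=False, thesis_favors_theory=False) == "supporting"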

    5. Self-Validation Layer

    Before output, the system validates:

    • Argument count (academic texts typically have 5-15+ distinct arguments)

    • Depth check (most academic texts have 2-4 levels)

    • Balance check (detects one-sidedness)

    • Type accuracy verification (sketched in code below)

    What We Acknowledge:

    • Single-pass analysis (no iterative refinement yet)

    • General academic analysis rather than domain-specific ontologies
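
    For concreteness, here is a sketch of those validation checks, reusing the ArgumentNode type from the earlier snippet (thresholds follow the numbers above; the code itself is illustrative):

      def validate(arguments: list[ArgumentNode]) -> list[str]:
          """Return warnings for analyses that fail the sanity checks."""
          warnings = []
          if len(arguments) < 5:  # academic texts typically have 5-15+ arguments
              warnings.append(f"only {len(arguments)} distinct arguments found")
          max_depth = max((a.depth() for a in arguments), default=0)
          if not 2 <= max_depth <= 4:  # most academic texts have 2-4 levels
              warnings.append(f"unusual dialectical depth: {max_depth}")
          supporting = sum(a.stance == "supporting" for a in arguments)
          if supporting in (0, len(arguments)):  # balance: flag one-sidedness
              warnings.append("analysis appears one-sided")
          return warnings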

    I appreciate critical feedback, but the system is not vote counting. Feel free to test with a demo account.