Manual labeling

Hand-classify the corpus.

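Pull a random unlabeled lookalike, inspect its metadata, and classify it as scam, legit, unsure, or skip. Labels accumulate into a per-layer precision estimate, which serves as the empirical anchor for the paper. Aim for at least 100 labels per layer: at that sample size the 95% confidence interval on precision is at most about ±10 percentage points, which is tight enough to defend.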
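To make the precision estimate concrete, here is a minimal sketch of how labels for one layer could be turned into a precision figure with a 95% interval. It assumes precision means scam / (scam + legit), with unsure and skip labels left out of the denominator; the page does not state this definition. The function names `wilson_interval` and `layer_precision` are hypothetical, and the Wilson score interval is one reasonable choice of interval, not necessarily the one this tool uses.

```python
import math
from collections import Counter


def wilson_interval(successes, n, z=1.96):
    """Wilson score interval for a binomial proportion (z=1.96 gives ~95%)."""
    if n == 0:
        # No decided labels yet: the interval is uninformative.
        return (0.0, 1.0)
    p = successes / n
    denom = 1 + z * z / n
    center = (p + z * z / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return (max(0.0, center - half), min(1.0, center + half))


def layer_precision(labels):
    """Summarize one layer's labels.

    Assumption: only 'scam' and 'legit' count toward precision;
    'unsure' and 'skip' are excluded from the denominator.
    """
    counts = Counter(labels)
    scam, legit = counts["scam"], counts["legit"]
    n = scam + legit
    precision = scam / n if n else None
    low, high = wilson_interval(scam, n)
    return {"n": n, "precision": precision, "ci95": (low, high)}


if __name__ == "__main__":
    # Hypothetical labels for a single layer.
    sample = ["scam"] * 80 + ["legit"] * 20 + ["unsure"] * 5 + ["skip"] * 3
    print(layer_precision(sample))
```

The Wilson interval is used here instead of the plain normal approximation because it stays within [0, 1] and behaves sensibly when a layer's precision is close to 0 or 1, which is likely for the cleanest and noisiest layers.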



Progress
