Assistant Director of Innovation, Research & Instruction
Former corporate bankruptcy partner, current law librarian
I practiced corporate bankruptcy for 10 years and now teach legal research and legal technology as a librarian. I'm also a huge advocate for open access to legal information and am on the board at Free Law Project, so I enjoy exploring what's possible with open data. I'm fascinated by the Legal Quants project because I had *so many* of these little ideas when I was practicing, where I could never find quite the right piece of software.
I built Verify and Retrieve to check legal citations against CourtListener's API. It catches hallucinated case citations from AI tools by verifying whether a citation actually exists and belongs to the case it claims to. Existing citation-checking tools often flagged unpublished opinions and PACER-only documents as "not found," so I built a three-step pipeline that goes deeper — first checking the citation lookup API for an exact match, then falling back to fuzzy opinion search, and finally searching RECAP docket data to find orders, memoranda, and other documents that only exist in PACER. It returns a confidence score with specific diagnostics explaining any mismatches (wrong court, wrong date, name doesn't match, etc.). I also wanted an easy way to pull full text for a batch of opinions at once, so once citations are verified you can download the matched opinion text or PDFs directly from the results page — paste in a list of citations, and walk away with the source documents.
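The fallback logic above can be sketched in a few lines. This is a hypothetical illustration, not the tool's actual code: each "lookup" stands in for one of the three CourtListener queries (exact citation lookup, fuzzy opinion search, RECAP docket search), and the confidence weights and diagnostic fields are invented for the example.

```python
from typing import Callable, Optional

# A lookup takes a citation string and returns match metadata, or None.
Lookup = Callable[[str], Optional[dict]]

def verify_citation(citation: str, claimed: dict,
                    lookups: list[tuple[str, Lookup]]) -> dict:
    """Try lookups in order (exact -> fuzzy -> recap); score the first hit."""
    for source, lookup in lookups:
        match = lookup(citation)
        if match is None:
            continue  # fall through to the next, broader search
        # Compare what the AI-generated citation claimed against the record.
        diagnostics = [
            f"wrong {field}"
            for field in ("court", "date", "case_name")
            if claimed.get(field) and match.get(field) != claimed[field]
        ]
        # Illustrative weights: exact matches score highest, RECAP lowest,
        # and each metadata mismatch drags the score down.
        confidence = {"exact": 1.0, "fuzzy": 0.7, "recap": 0.5}[source]
        confidence -= 0.2 * len(diagnostics)
        return {"found": True, "source": source,
                "confidence": round(max(confidence, 0.0), 2),
                "diagnostics": diagnostics}
    return {"found": False, "source": None, "confidence": 0.0,
            "diagnostics": ["citation not found in any source"]}
```

A citation that misses the exact lookup but surfaces in fuzzy opinion search would come back with `source="fuzzy"` and a reduced confidence score, plus a diagnostic for any field that disagrees with what the citation claimed.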
Bankruptcy Docket Q&A is an open-source RAG (Retrieval-Augmented Generation) tool I built that lets users ask natural-language questions about federal bankruptcy cases and receive AI-generated answers grounded in actual court filings. It pulls docket data and documents from Free Law Project's RECAP Archive via the CourtListener API, classifies and indexes them using ChromaDB, and routes queries through a smart retrieval pipeline that selects relevant document chunks based on question intent. Answers are always cited to specific ECF docket entry numbers, ensuring traceability back to the source filings. The stack is Python, Streamlit, ChromaDB, and Claude (Anthropic), with support for incremental PACER document purchases when deeper analysis requires documents not yet in the public archive.
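The "routes queries based on question intent" step might look something like the sketch below. The intent categories, keywords, and document-type names are all illustrative assumptions, not the tool's actual schema; the idea is simply that the question is classified first, and the vector search is then restricted to the document types that intent implies (in ChromaDB, that restriction could be expressed as a `where` filter on chunk metadata).

```python
# Hypothetical intent rules: keywords in the question map to the
# bankruptcy document types whose chunks are worth retrieving.
INTENT_RULES = {
    "plan": {"keywords": ("plan", "confirmation", "disclosure"),
             "doc_types": ("plan", "disclosure_statement")},
    "sale": {"keywords": ("sale", "363", "bid", "auction"),
             "doc_types": ("sale_motion", "sale_order")},
    "claims": {"keywords": ("claim", "bar date", "objection"),
               "doc_types": ("proof_of_claim", "claim_objection")},
}

def route_query(question: str) -> list[str]:
    """Return the document types whose indexed chunks should be searched."""
    q = question.lower()
    types: list[str] = []
    for rule in INTENT_RULES.values():
        if any(kw in q for kw in rule["keywords"]):
            types.extend(rule["doc_types"])
    # No intent matched: fall back to searching the whole docket index.
    return types or ["docket_entry"]
```

A question about the claims bar date would retrieve only from claim-related chunks, while an off-pattern question falls back to the full docket, which keeps every answer traceable to a specific class of ECF filing.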
Women Talkin’ ’Bout AI
In this episode of Women Talkin’ ’Bout AI, we sit down with Rebecca Fordon — law librarian, professor, and board member of the Free Law Project — to talk about how generative AI is transforming legal research, education, and the meaning of “expertise.” Rebecca helps us cut through the hype and ask harder questions: What problem are we really trying to solve with AI? Why are we using certain tools, and do we even know what data they’re built on?
ABA Journal
AI tools are genuinely useful for legal research, but we jumped to adoption before building the verification layer. Verification infrastructure should come standard.
Law librarians spend a lot of time testing and reviewing AI tools other people build. That's valuable, but we're leaving something on the table. Librarians understand the research process at a granular level, we understand organization and retrieval of information, and we know what's missing in the tools we use every day. This makes us uniquely positioned to build those tools, not just critique them.
So many of the tools I've built rely on CourtListener's open API. You'll see this across the legal tech infrastructure, particularly in vibe-coded projects: we're using open source software packages and open data, and sharing ideas. When legal data is locked behind paywalls, innovation is limited to companies that can afford access. Open data lets anyone build, and the tools that come out of it serve people who can't afford $500/month subscriptions, but also people who aren't being served quickly enough by those $500/month behemoths.