Module I of IV
The Practice Is Changing
A line that anchors how to read the rest: "The people closest to AI aren't making predictions. They're reporting what already happened to them." — Matt Shumer, 2026. The phase where this was a forecast is over. The phase where it's a report is well underway.
For law, Zack Shapiro's Claude-Native Law Firm was the moment the legal version of the report became concrete. His thread went viral in February — over seven million views as of April 2026 — not because the argument was unfamiliar but because the artifact was: a working transactional practice, running on a system that looks almost nothing like what law firms have looked like for fifty years.
Shapiro is the vanguard, not the center of gravity. My read — informed by the firms I talk with — is that BigLaw's median practice is twelve to eighteen months behind him. The AI future of law is already here; it's just not evenly distributed. Watch how fast he got there, and what he was willing to abandon along the way. The direction becomes clear; only the pacing is open.
Module I · Learning objective
By the end of this module you will be able to calibrate the gap between the vanguard and the median of current legal AI adoption — and make your own call on how quickly that gap will close.
~45 minutes of reading + optional Try-It exercise
Something Big Is Happening
Matt Shumer's widely read essay on the AI moment. The line that anchors the whole course: "The people closest to AI aren't making predictions. They're reporting what already happened to them." Read this first; everything in Module I reads differently after you have.
Read essay
Inside the Claude-Native Law Firm
Matt Pollins interviews Zack Shapiro about how he actually runs Raines LLP on Claude — the workflows, the security and privilege considerations, the billable-hour question. The cleanest reader-friendly version of Shapiro's argument; read it before the original thread.
Read essay
The Claude-Native Law Firm (X thread)
Shapiro's original thread. Over seven million views as of April 2026. Harder to read than the Pollins piece but worth seeing in the original form — the compressed, bullet-point delivery is part of why the argument landed the way it did.
Read on X
Claude Meets Westlaw and Lexis
Seth Chandler on how recent Claude versions can directly drive Westlaw and Lexis through browser control. Opening line: "Something remarkable has happened in the last few months, and most of the legal academy has not noticed." That observation is itself the signal.
Read analysis
The AI Future of Law Is Already Here — It's Just Not Evenly Distributed
William Gibson's line about the future, applied to legal practice in April 2026. A short, pointed reminder that the vanguard (Shapiro) and the median of the industry are different data points. Read after Pollins and before the Anthropic case study.
Read article
AI Built For Law Outperforms ChatGPT, Claude, and Gemini on Legal Reasoning
Recent bar-exam-reasoning benchmark from a legal-AI vendor comparing current frontier models: ChatGPT 5.2 at 93.41%, Claude Opus 4.5 at 89.03%, with the vendor's own product scoring higher. Self-reported vendor benchmarks are gameable — treat the specific numbers as directional. The velocity signal is what matters.
Read article
How Anthropic's Legal Team Cut Review Times from Days to Hours
Anthropic's own legal team on integrating Claude into their workflows. Marketing review went from 2-3 days to 24 hours. Four concrete workflow types: contract redlining, marketing self-review, conflict-of-interest reviews, privacy impact assessments. A clean example of what sophisticated institutional adoption actually looks like.
Read case study
Ask the NotebookLM
Try these prompts in the course NotebookLM
- Compare Shapiro's claim that general-purpose Claude beats specialized legal AI against Ambrogi's benchmark showing a purpose-built legal tool scoring higher. Where do they actually disagree?
- Based on the Anthropic case study and the Slaw "not evenly distributed" piece, construct a best-case and worst-case timeline for BigLaw median practice catching up to Shapiro's setup.
- What are the three most specific operational capabilities Shapiro has that a traditional firm doesn't? Ground your answer in the Pollins interview.
Exercise · Hands-on
Put the thesis to a working test
Take the sample commercial lease (pedagogical artifact, not a real lease). Paste or upload it into your AI of choice — Claude, ChatGPT, Gemini, or a legal-specific tool if you have one. Ask it to do three things you'd normally have a mid-level associate do:
- produce a 200-word executive summary for a tenant-side client;
- identify the three provisions most unfavorable to the tenant;
- propose redline edits a tenant could use as a negotiation starting point.
Then evaluate. What did the AI get right? What did it miss? What would you still need a senior lawyer for? The exercise turns Shapiro's thesis from abstraction into a concrete calibration — what Claude-native practice actually looks like on a real piece of work.
Module II of IV
What This Means for Legal Education
Scholarship is slower than the models. The Minnesota/Michigan RCT tested o1-preview and a mid-2024 version of Vincent AI; by the time it reached wide circulation the field had moved two model generations forward. The empirical results remain useful — they tell you what's possible, what the failure modes look like, what trained users can actually extract — but the specific performance numbers should be read as lower bounds, not current state.
What the scholarship is good for is direction. The empirical work in this module points the same way from a few different angles: recent models materially improve legal work quality when the workflow is designed well, and materially hurt it when the workflow isn't. The question has moved from "does AI help?" to "what workflow design extracts the value and avoids the hallucination risk?" That's a pedagogical question as much as a practice question.
Module II · Learning objective
By the end of this module you will be able to separate what scholarship has actually established about AI in legal work from what's still hypothesis — and identify the pedagogical design choices the established findings point toward.
~60 minutes of reading + optional NotebookLM session
On Working with Wizards
Ethan Mollick's shift from "co-intelligence" to "wizards" — the observation that recent models are increasingly opaque, producing sophisticated outputs through processes users can't see into. The best short framing of what it's like to actually work with current AI. Read this first.
Read essay
On AI, Universities, and Higher Education
Jesús Fernández-Villaverde, a Penn economist (SAS), wrestling publicly with what AI does to the case for traditional higher education. A sequence of threads from September 2025 to March 2026 — "Is AI the biggest change in education since the printing press?" (yes), followed by "twelve arguments for traditional higher education" and an evaluation framework. A Penn colleague working through our problem in real time.
Read thread series
All In: Embedding AI in the Law School Classroom
Gregory Duhl on embedding AI throughout a required doctrinal course — reimagining legal education by integrating AI as a learning enhancer rather than a threat to be managed. A concrete peer-institution account of what curriculum-level AI integration looks like.
Read article
What Most Law Schools Are Doing About AI
My read of where U.S. law schools sit on AI integration in early 2026 — drawn from Bloomberg Law's 2026 Path to Practice survey, recent peer-school announcements, and conversations with administrators at other schools. Four visible patterns (electives, embedded integration, certificates, policy updates) and two observations about why public signal isn't a reliable measure of institutional depth.
Read my summary
AI-Powered Lawyering
Minnesota/Michigan-led randomized controlled trial testing whether recent AI tools (RAG + reasoning models) materially improve legal work. Findings: statistically significant productivity gains of 50–130% across five of six legal tasks, with real quality improvements over earlier studies of GPT-4. The paper tested mid-to-late 2024 models; read the direction, not the specific numbers.
Read my summary · Read on SSRN →
Grading Machines: Can AI Exam-Grading Replace Law Professors?
Six-author empirical study of AI performance grading real law school exams across four subjects at top-30 U.S. schools. Pearson correlations between AI-assigned and faculty-assigned grades up to 0.93 when AI is given a detailed rubric. The patterns of disagreement are more illuminating than the headline finding.
Read my summary · Read on SSRN →
Turning Risks of Cheating with AI into Opportunities for Better Teaching
John Lande reframes the AI cheating problem as a teaching-design problem. The core move: if students can get AI to produce the answer you were going to grade, the problem isn't the AI — it's that the assessment was measuring something an AI can do. Redesign the assessment.
Read my summary · Read on SSRN →
Can AI Hold Office Hours?
Ouellette et al. test AI models on answering 185 law-school questions about a patent-law casebook. Tests GPT-4o, Claude 3.5 Sonnet, and NotebookLM — the same tool stack used to build the very NotebookLM on this page. Read it and then chat with this course's NotebookLM to calibrate where reliability actually sits now.
Read my summary · Read on SSRN →
Large Language Scholarship
Frazier and Rozenshtein argue that generative AI is already reshaping legal academia itself — law review submissions polished or partly generated by AI, articles written in weeks rather than months, detection practically impossible. Their prediction: disclosure rules won't hold, and scholarly norms will shift toward accepting AI-assisted work as routine. The outlier in this module — every other piece is about AI in legal practice or pedagogy; this one is about AI in legal academia.
Read on SSRN →
Ask the NotebookLM
Try these prompts
- The Schwarcz RCT tested models from mid-to-late 2024. If you extrapolated the gains proportionally to Claude Opus 4.5 or GPT 5.2, what would you estimate the current gains to be? Where does the extrapolation break down?
- Compare Duhl's "All In" approach to the pedagogy pillar of the proposed Penn AI initiative. What does each get right that the other misses?
- Lande argues we should redesign assessments rather than ban AI. The Cope et al. "Grading Machines" paper shows AI grades correlate with human grades at 0.93. Take both seriously — what would a redesigned assessment actually measure?
Exercise · Hands-on
Replicate the Ouellette methodology
Use the sample commercial lease from the Module I exercise. Apply Ouellette et al.'s "Can AI Hold Office Hours?" methodology: write five specific factual questions a student or junior associate might ask about the document — "what's the notice period for renewal?", "which party bears HVAC replacement costs?", "what triggers the personal guarantee?". Ask your AI of choice to answer each using only the lease text. Grade using Ouellette's three-way rubric: fully correct / partially correct or minor error / harmfully wrong. How does your tally compare to Ouellette's 14–31% harmful-response rate from late 2024? Is Ouellette's warning still warranted, partially warranted, or has the field moved past it?
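If you want to keep the comparison honest across several runs or models, the tally is simple enough to script. A minimal sketch: the grade labels mirror the rubric above, but the sample grades are made up for illustration, not Ouellette et al.'s data.

```python
from collections import Counter

# The three-way rubric from the exercise above.
RUBRIC = ("fully correct", "partially correct or minor error", "harmfully wrong")

def harmful_rate(grades: list[str]) -> float:
    """Fraction of graded answers that fall in the 'harmfully wrong' bucket."""
    for g in grades:
        assert g in RUBRIC, f"unknown grade: {g}"
    return Counter(grades)["harmfully wrong"] / len(grades)

# Hypothetical grades for the five lease questions (illustrative only).
grades = [
    "fully correct",
    "fully correct",
    "partially correct or minor error",
    "fully correct",
    "harmfully wrong",
]
print(f"harmful-response rate: {harmful_rate(grades):.0%}")  # prints "harmful-response rate: 20%"
```

Run it once per model or per session and compare the resulting rates against the 14–31% band to see where your tool of choice lands.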
Module III of IV
What We're Actually Building
Modules I and II map where the field stands. Module III describes the work underway at Penn.
Three tiers. At the institutional level, I'm drafting an AI-initiative proposal for Penn Carey Law — organized around the bet that the right response to fast-changing technology is institutional adaptive capacity rather than any single program. The proposal is currently under discussion with the Dean and the development team. At the curriculum level, programs and courses are already integrating AI intentionally — Legal Practice Skills has redesigned the 1L sequence; the AI Law Lab (which I lead on pedagogy) runs two bootcamp tracks. At the infrastructure level, I've been building tools and skills any faculty member can use — Heron (a Claude Code-built teaching assistant), and an open-source skill collection. Shapiro built a Claude-native law firm. The pedagogical analog is what I've been building at individual scale.
Module III · Learning objective
By the end of this module you will have a concrete picture of what institutional, curricular, and infrastructural responses to the AI transformation actually look like — and the specific question to bring back to your own institution.
~45 minutes across three tiers
Tier 1 — Institutional
The forward-looking bet: a five-year initiative designed around adaptive capacity rather than a single program.
Building the Future of the Legal Profession
A public summary of a proposed Penn Carey Law AI Initiative — four pillars (Research, People, Building, Pedagogy), the adaptive-architecture thesis, and the five-year direction. Described at a level appropriate for a general audience; the full concept document is in discussion with the Dean and development team.
Download summary
Forging the Future: AI at Penn Carey Law
Penn Carey Law's public account of its AI work — AI integration in the 1L Legal Practice Skills curriculum, the AI Law Lab, and institutional access to ChatGPT EDU and Harvey AI for students. External news coverage of what the school is doing on AI.
Read article
Legal Tech Lab (working title)
A full-year 4-credit course launching in AY 2026-27. A small cohort of students building access-to-justice tools with community partners — focused on what scalable AI can actually do at the access-to-justice frontier. The goal is students who don't just think about the law but build with it.
AI 1L Orientation — Fall 2025
The AI session I co-run in the incoming 1L orientation — what Penn provides, permitted use as the school currently frames it, and the ethical questions I want students thinking about from day one. One piece of the institution-level work law schools are working out right now.
Download slides
Tier 2 — Curriculum-Level Integration (Spring 2026)
Concrete examples of AI being integrated into real courses at the curriculum level — not AI as a topic in an elective, but AI woven through the pedagogy of courses students take whether they elect it or not.
What Curriculum-Level AI Integration Actually Looks Like
A summary of Penn Legal Practice Skills' approach to integrating AI into the 1L curriculum. The key move is sequencing: AI-free baseline → structured instruction → three anchor experiential modules → reflection. Shared-platform rationale (transparency + equity + FERPA) follows from the design. The pattern generalizes beyond LPS.
Download summary
AI Law Lab Bootcamps — Corporate & Litigation Tracks
The AI Law Lab's Spring 2026 bootcamp sessions. Two parallel tracks (Corporate, Litigation) — track separation matters because corporate and litigation workflows diverge enough that a generic course shortchanges both. Case-file method, practitioner integration, shared ethics thread through every session.
Download summary
AI Law Lab Bootcamp — Q&A with the Instructors
Penn Carey Law's news feature on the Spring 2026 bootcamp, told from the instructors' perspective. Meghana Bhimaro and Lakshmi Prakash (both L'25 W'25) describe the course's design philosophy — technical competency plus the judgment to know where AI adds value and where it falls short. An external view that complements the internal syllabus summary.
Read article
Tier 3 — Presentations and Infrastructure
The talks I give, the tools I've built, and the skills any faculty member can use or adapt. Running infrastructure and working pedagogy — what I'd actually hand another faculty member.
Something Big Is Happening — AI and the Future of Legal Education
My most recent flagship talk on this subject — delivered February 19, 2026. Opens with Shumer's viral essay, walks through the agent-vs-chatbot shift, demonstrates Claude / Cowork / Claude Code live, and closes with the Slackbot and the exam-grading pipeline. The closest thing to a single-deck version of this whole course.
Download slides
Teaching with Generative AI — Fall 2024
Earlier faculty retreat slides on practical AI use in law teaching — the classroom pilots, the ethical and legal questions, what to try. Useful as a companion to the Feb 2026 deck: the same subject, eighteen months earlier. The gap between the two decks is itself data about how fast this has moved.
Download slides
Heron — the Intro to IP Teaching Assistant
A Slack bot I built for my Intro to IP course. Students know it as Heron — named for Heron of Alexandria, the ancient mathematician who bridged theoretical knowledge and practical invention. Answers student questions with citations back to the assigned course materials. Under the hood it uses retrieval-augmented generation (RAG): the course materials are embedded into a vector database, and the passages most relevant to a question are retrieved to ground the answer and its citations. In this course's usage, students reach for Heron far more often than Canvas for Q&A.
Built entirely with Claude Code, by me. No engineering team, no vendor. The parallel to Shapiro's Claude-native law firm is direct: Shapiro built his practice around Claude; I built the teaching infrastructure for my course around Claude Code. Both are available to individuals now. The demonstration is narrow but concrete: course-specific AI teaching infrastructure no longer requires a vendor.
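For faculty curious about the moving parts, the retrieval step behind a bot like Heron is compact enough to sketch. Everything below is illustrative, not Heron's actual code: the term-count "embeddings" stand in for a real embedding model, and the course chunks and function names are hypothetical. A production bot would swap in an embedding API and a vector database, but the shape of the pipeline (embed, retrieve by similarity, ground the prompt) is the same.

```python
import math
from collections import Counter

# Hypothetical course-material snippets; a real system would chunk
# the assigned readings and store the embeddings in a vector database.
COURSE_CHUNKS = [
    "Fair use is an affirmative defense with four statutory factors.",
    "A patent claim must be novel and non-obvious over the prior art.",
    "Trademark protects source identifiers, not useful product features.",
]

def embed(text: str) -> Counter:
    """Toy 'embedding': lowercase term counts (a stand-in for a real model)."""
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(question: str, k: int = 2) -> list[str]:
    """Return the k chunks most similar to the question."""
    q = embed(question)
    return sorted(COURSE_CHUNKS, key=lambda c: cosine(q, embed(c)), reverse=True)[:k]

def build_prompt(question: str) -> str:
    """Assemble a grounded prompt: answer only from retrieved text, with citations."""
    context = "\n".join(f"[{i + 1}] {c}" for i, c in enumerate(retrieve(question)))
    return (
        "Answer using only these course excerpts, citing [n]:\n"
        f"{context}\n\nQ: {question}"
    )

print(build_prompt("What are the fair use factors?"))
```

The grounding instruction is what produces the citation behavior students see: the model is asked to answer only from the retrieved excerpts and to cite them by number, rather than answering from its general training.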
Meet Heron (student-facing doc) · View repo
law-faculty-skills — The Claude-Native Law Professor
An open-source collection of Claude Code and ChatGPT skills for law faculty teaching tasks — MCQ generation, essay exam generation, class problem creation, slide review, full class prep, and more. Shapiro built a Claude-native law firm. This is the pedagogical analog.
View repo
AI Law Lab Resource Menu
Living directory of guides: AI attribution policies, sample syllabus language, prompt engineering for legal work, legal AI tool overviews, and how to build a Virtual TA with Custom GPTs. Maintained by the AI Law Lab; updates propagate automatically.
Read guide
Creating a Virtual TA with Custom GPTs
Step-by-step guide to building a course Virtual TA using OpenAI's Custom GPTs on the ChatGPT EDU platform. Based on the one I built for Intro to IP before the Slack version.
Read guide
Exams Destabilized — Data Slides
Excerpt from my own Fall 2025 talk to the faculty on rethinking assessment — the "destabilized" framing is mine, not a faculty conclusion. The core data slides only: current Penn exam format breakdown (in-class vs. take-home, exam lengths, format distribution). A baseline for what's actually being assessed right now.
Download excerpt
Ask the NotebookLM
Try these prompts
- Compare the LPS curriculum-integration pattern (AI-free baseline → instruction → experiential modules) against the AI Law Lab bootcamp model. What's each approach optimizing for?
- Based on the proposed initiative summary, what's the bet that distinguishes "adaptive architecture" from a traditional law-school center? What specifically is being built that most centers don't build?
- What would a Claude-native professor's weekly workflow actually look like? Walk through Monday-Friday using the law-faculty-skills repo and Heron as starting points.
Exercise · Hands-on
Build a teaching artifact with AI
Use the same sample commercial lease. Spend fifteen minutes with your AI of choice building a pedagogical artifact around it. Pick one:
- a 5-question multiple-choice quiz testing whether a student understood the key terms;
- a short-answer exam question with a 150-word model answer;
- a rubric a faculty member could use to grade a student's own redline of the lease.
Evaluate what the AI produced. What part would you trust? What part still needs faculty judgment to finalize? What would you change if you were going to actually use this with students? The exercise tests the Heron thesis — that faculty can build AI teaching infrastructure at individual scale — on a real piece of coursework.
Module IV of IV
Open Questions
These are the questions I don't yet have clean answers to. Some are technical, some are structural, some are pedagogical, one is existential. They're not rhetorical setups — they're the honest state of the conversation.
I offer them for two reasons. First, because they're the questions worth arguing about in any room where legal educators and practitioners sit together. Second, because the NotebookLM on this page is seeded with every source in the course, and you can ask it to take a position on any of these questions and see where the sources push back. That's a better use of the technology than asking it to summarize — it's where the model-native conversation actually starts.
Module IV · Learning objective
Having worked through Modules I–III, you should be able to argue a coherent position on each of these six questions — not "settle" them, but hold a view you can defend to a colleague.
~20 minutes of reading + as much argument as you want
Running theme across the course
The Work Question — writing, pedagogy, and AI
Law schools have long taught by making students do the work of lawyers — cases read, briefs written, memos drafted, exams sat. The assumption, often implicit, is that the work itself is the pedagogy. That mechanism has a real virtue: it trains judgment through practice. It also has a cost: writing is slow, and the cost is what has kept repetitions low and feedback loops long.
AI changes both sides. It drops the cost of producing legal work dramatically — meaningful repetitions can rise by orders of magnitude. It also does the work, or something that looks like the work, which means a student can outsource the wrestling the pedagogy was trying to induce. Writing remains necessary; whether writing-as-traditionally-assigned is still sufficient is the open question.
The student's role in the writing changes — from producer of text to supervisor of text. Assessment has to test the supervisory judgment, not just the text. And two questions run through the rest of this course: if writing isn't the only pedagogy anymore, what else carries equivalent weight? And how do we train lawyers who can 10x their productivity with AI while maintaining the standards of careful, deep analysis the profession has always expected?
The six open questions below are different angles on that second one.
1. If AI can generate sophisticated legal reasoning, what are we testing on exams?
The traditional law-school exam — an issue-spotter, a close-reading question, a policy essay — is a proxy for "can this student do the analytical work a junior lawyer is expected to do." If the analytical work is increasingly AI-assisted in practice — and, as Module II's empirical work suggests, increasingly within reach of current models at the level of complexity a typical exam tests — the proxy starts to drift. This doesn't mean we should ban AI on exams or permit it unconditionally. It means we have to decide what we're actually measuring — capability without AI, capability with AI, or something new. In my read, most law schools are making that choice by default rather than by design.
2. Does a "Claude-native" 1L curriculum still have 1L courses as we know them?
The 1L year has looked substantially the same for generations. Civil Procedure, Contracts, Torts, Property, Constitutional Law, Criminal Law, Legal Writing. If a first-year student using Claude can produce the memo-and-brief output that the LRW program was designed to produce, the question isn't whether we still teach LRW. It's what LRW teaches that an AI-assisted student genuinely still needs — and what part is now redundant. Same question for each doctrinal course. The answer is probably "the core analytical skills remain essential, and the vehicles for teaching them change." Saying that is easier than doing it.
3. What happens to rank-in-class if AI-assisted output is indistinguishable from unassisted?
Every top school has a grade distribution that employers rely on. If the variance in student output collapses because AI smooths out the bottom of the distribution, the grade becomes less informative about student quality. The grade still sorts, but what it sorts on shifts. Firms and clerkship committees will adapt; the question is whether we lead that adaptation or chase it. The schools that lead it will have more say in what the next generation of sorting looks like.
4. Who pays for the tools — the firm, the school, the student, or do we stop caring?
Right now, at most top schools — in my observation — institutional AI provision is uneven: general-purpose tools (ChatGPT, Claude) are available to faculty, less consistently to students at scale, and legal-specific platforms are adopted patchily. Anecdotally, students who pay for Pro tiers tend to outperform students who don't on AI-assisted assignments. This is an equity question, a curriculum question, and a budget question at once — and every school is answering it differently. The answer probably converges toward "the school pays," because the alternative is either a mandated opt-out or a structural disadvantage for students who can't afford a subscription. The timeline on that convergence is unclear, and the cost compounds.
5. How do law schools maintain pedagogical identity when every school adopts the same base-model technology?
A Penn Carey Law graduate has historically been a specific thing. A Harvard graduate, ditto. Some of that is brand, but some is training — faculty, cases, clinical opportunities, peer group, specific pedagogy. If every top school runs similar AI-augmented courses on the same underlying models, what differentiates the graduates? The answer is in the specific pedagogical choices, the research culture, and the institutional values — and institutions will have to be more deliberate about those than they historically needed to be.
6. What does "professional judgment" mean when the generation step is automated but the selection step is the actual lawyering?
This may be the deepest one. Every lawyer spends their career learning what the generation step feels like — drafting arguments, spotting issues, synthesizing cases. That apprenticeship is what junior associate work has traditionally been. If the generation step becomes mostly automated, the apprenticeship has to focus on the selection step — which argument is actually persuasive, which issue actually matters, which synthesis is actually correct. That's a different skill, taught through different exercises, assessed differently. We've never taught it as explicitly as we'll need to, because we've relied on years of post-graduate practice to develop it. That runway is shortening.
Ask the NotebookLM
Take a side
- Pick any of the six questions above. Ask the NotebookLM: "Defend the view that [your position]. Cite specific sources where they support you." Then ask: "Now defend the opposite view. Cite specific sources where they support that view." The gap between the two responses tells you where the sources actually disagree.
- Ask: "Which of Wagner's six open questions is most urgent for a law school to settle in 2026? Which is most likely to resolve on its own if we wait?"
Exercise · Hands-on
Use AI to steelman the view you don't hold
Pick the open question above where your own view is weakest. Open the NotebookLM (or any AI you use) and ask it to defend the position you're least sympathetic to, citing specific course sources where they support that view. Then ask it to defend your own position, citing the same sources. Compare. Where did it find real support on both sides? Where did it paper over genuine disagreement between the sources? The AI's willingness to argue both sides equally is a feature and a trap — it tells you where the literature actually disagrees, but it can also manufacture false balance. The exercise is about calibrating both the question and the AI.
Interactive
Chat with the Course
Every source in the course — plus the Module III summaries I wrote — lives in a NotebookLM notebook. Ask it questions across the material, or watch and listen to NotebookLM's AI-generated overviews below. Use the prompts suggested at the end of each module as starting points.
Open the NotebookLM
Every source, chattable. Great for targeted questions grounded in the actual materials — "what's the Minnesota RCT's core finding?" or "how does LPS sequence AI introduction?" Publicly accessible — any free Google account works; no invitation or approval needed. The sign-in is a NotebookLM platform requirement, not a restriction on who can view.
Open NotebookLM →
How AI Builds the 10x Lawyer
NotebookLM's AI hosts in a long-form discussion across all four modules, generated with the custom four-module steering prompt. Hosted here directly — no Google sign-in required.
AI and the Future of Law
NotebookLM's auto-generated video overview — same source material, visualized. A shorter complement to the audio discussion. Hosted here directly.
Caveat on output. The AI-generated audio, video, and chat are demonstrations. The sources themselves are authoritative; the generated outputs are not. If a question matters, check the source.
Project Overview
The Eleven-Minute Version
My framing of the entire course in about eleven minutes — the Shapiro moment, why scholarship lags practice, what we're building at Penn, the open questions. Listen below, or read the transcript.
~11 minutes · ~1,970 words · written in my voice for the ear rather than the eye · synthesized via ElevenLabs
University of Pennsylvania Carey Law School