CardCite

From any page to a finished citation.

CardCite reads any webpage or PDF, fills in the publisher, date, title, author, and URL, and produces a properly formatted citation that pastes into your doc with formatting intact.

[Video: CardCite demo]
01 / What it does

One click between a page and a finished citation.

CardCite reads the page you're already on, pulls out the parts of a citation you'd otherwise hunt for, and assembles them in your preferred format. Paste straight into Google Docs or Word with every font size and style preserved.

To use it on any page or PDF, click the puzzle piece icon in your browser toolbar, then click CardCite. You can also pin it to your toolbar (shown in the video above) so it's a single click.

02 / Source descriptions

Six hundred source descriptions saved

When CardCite recognizes the site you're on, it fills in a short description of the outlet automatically, drawn from a database that covers news outlets, major publications, think tanks, academic journals, and government bodies. This dataset will continue to be updated, with new sources tailored to upcoming resolutions.

Evidence Citation
Bipartisan Policy Center
The Bipartisan Policy Center is a nonprofit think tank founded in 2007 by former U.S. Senate Majority Leaders Tom Daschle, Howard Baker, Bob Dole, and George Mitchell. It develops policy recommendations through a bipartisan process involving former government officials, policy experts, and stakeholders.
03 / Date detection

Dates, pinned precisely.

CardCite uses a layered approach to identify the correct publication date: it reads structured page metadata first, then cross-references any candidates against the date visible on the page. It recognizes dates in a wide range of formats and abbreviations.
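To give a feel for what "a wide range of formats and abbreviations" involves, here is a minimal sketch in Python. It is an illustration only: the format list and the `parse_date` helper are assumptions made for this example, not CardCite's own parser, which likely handles many more cases.

```python
import re
from datetime import date, datetime
from typing import Optional

# Hypothetical format list, tried in order; chosen for this example only.
FORMATS = [
    "%Y-%m-%d",   # 2024-03-05
    "%B %d, %Y",  # March 5, 2024
    "%b %d, %Y",  # Mar 5, 2024
    "%d %B %Y",   # 5 March 2024
    "%d %b %Y",   # 5 Mar 2024
    "%m/%d/%Y",   # 03/05/2024 (US month-first order assumed)
]


def parse_date(text: str) -> Optional[date]:
    """Return a date for the first format that matches, or None."""
    cleaned = text.strip()
    # Keep only the date part of an ISO timestamp like 2024-03-05T14:30:00Z.
    cleaned = re.sub(r"^(\d{4}-\d{2}-\d{2})T.*$", r"\1", cleaned)
    # Normalize "Sept" / "Sept." to the three-letter form strptime expects.
    cleaned = re.sub(r"\bSept\b\.?", "Sep", cleaned)
    # Drop the period after a three-letter abbreviation ("Mar." -> "Mar").
    cleaned = re.sub(r"\b([A-Za-z]{3})\.", r"\1", cleaned)
    # Drop ordinal suffixes ("5th" -> "5").
    cleaned = re.sub(r"(\d)(st|nd|rd|th)\b", r"\1", cleaned)
    for fmt in FORMATS:
        try:
            return datetime.strptime(cleaned, fmt).date()
        except ValueError:
            continue
    return None
```

With this sketch, `parse_date("Sept. 5, 2024")` and `parse_date("5 September 2024")` both return `date(2024, 9, 5)`, and unrecognized text returns `None`.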

CardCite defaults to the Published date. If the page has both a Published and an Updated date and they differ, it uses the Updated date; when the two fall on the same day, it uses the Published date without an Updated label. As a last resort, if the metadata contains no date, the extension scans the page text for dates next to keywords such as published, updated, or posted. Before scanning, it removes sidebars, footers, related content, and similar noise so that dates from unrelated parts of the page are not picked up. A date found this way is used, but with a warning, because the scan can occasionally pick up an unrelated date.
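The last-resort text scan might look roughly like the Python sketch below. The keyword list comes from the description above; the date pattern, the `scan_for_date` name, and the assumption that noise elements have already been removed from the input are all simplifications for illustration, not the extension's actual code.

```python
import re
from typing import Optional, Tuple

# Keywords named in the description above.
KEYWORDS = r"(?:published|updated|posted)"
# Simplified date pattern (an assumption): "March 5, 2024", "Jan. 3, 2023", or "2024-03-05".
MONTH = r"(?:Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec)[a-z]*\.?"
DATE = rf"(?:{MONTH}\s+\d{{1,2}},\s+\d{{4}}|\d{{4}}-\d{{2}}-\d{{2}})"
# Allow a little punctuation or an "on" between the keyword and the date.
PATTERN = re.compile(
    rf"\b{KEYWORDS}\b[^0-9A-Za-z]{{0,3}}(?:on\s+)?({DATE})",
    re.IGNORECASE,
)


def scan_for_date(main_text: str) -> Optional[Tuple[str, bool]]:
    """Return (date_text, flagged) for the first keyword-adjacent date, else None.

    `main_text` is assumed to be the article text with sidebars, footers,
    and related-content blocks already stripped out.
    """
    match = PATTERN.search(main_text)
    if match is None:
        return None
    # A date found by scanning is always flagged so the user can double-check it.
    return match.group(1), True
```

For example, `scan_for_date("Posted on Jan. 3, 2023 by staff")` returns `("Jan. 3, 2023", True)`; the `True` stands for the warning shown alongside the date.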

| Page and/or metadata shows | CardCite cites |
| --- | --- |
| Published date only | Published date |
| Updated date only | Updated date, labeled "Updated" |
| Published and updated dates match | Published date (editable in options) |
| Published and updated dates differ | Updated date, labeled "Updated" (if shown on page) |
| No published or updated date visible on page | Published date from metadata |
| No date in page metadata | First date following a "Published" or "Updated" label in the page text, flagged |
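The rules above can be written as a small decision function. This Python sketch is one possible reading of them: the names `DateChoice` and `choose_date` are invented for the example, dates are compared at day precision, and the "editable in options" setting for matching dates is left out.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class DateChoice:
    value: Optional[date]
    label_updated: bool = False  # show the "Updated" label in the citation
    flagged: bool = False        # warn the user to double-check the date


def choose_date(
    published: Optional[date],
    updated: Optional[date],
    updated_shown_on_page: bool = True,
    text_scan: Optional[date] = None,
) -> DateChoice:
    """Pick the date to cite from metadata dates, with a text-scan fallback."""
    if published and updated:
        if published == updated:
            # Same day: use the Published date, no "Updated" label.
            return DateChoice(published)
        if updated_shown_on_page:
            return DateChoice(updated, label_updated=True)
        # The update is not shown on the page, so stay with Published.
        return DateChoice(published)
    if updated:
        return DateChoice(updated, label_updated=True)
    if published:
        return DateChoice(published)
    if text_scan:
        # No metadata date at all: fall back to the scanned date and flag it.
        return DateChoice(text_scan, flagged=True)
    return DateChoice(None)
```

Returning the label and the warning alongside the date keeps the choice and its caveats together, so the citation builder does not need to repeat the rules.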
03.1 / Accuracy testing
96/100

Before launching, the date detector was tested on 100 webpages. The pages covered ten widely varied topics, ten pages per topic, and every page came from a different source. PDFs were not part of this test.

On 96 of the 100 pages, the extension returned the correct date. I counted a result as correct when the extension pulled the right date for the article, and also when it returned no date for a page where none was available. The four pages where it got the date wrong have since been fixed.

A mistake was defined as any of the following:
  1. Wrong date chosen. The modified date was used when the page only showed a published date, or a date from a sidebar or related article was picked up instead of the main article’s date.
  2. Missing date. No date was returned even though one was clearly visible on the page or present in the page’s structured metadata.
  3. Wrong format or parse. The right date was found but displayed incorrectly: garbled, raw and unparsed, off by a day, or with the month and day swapped.
  4. Wrong “Updated” behavior. The citation said “Updated” when the page did not indicate the article had been updated, or did not say “Updated” when the page did.
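The scoring rule can be sketched in a few lines of Python. This is a hypothetical reconstruction of the bookkeeping, not the script used for the test. The `classify` and `accuracy` names are made up, and format or parse mistakes (type 3) are not modeled separately: an off-by-a-day or swapped month and day simply shows up here as a wrong date.

```python
from datetime import date
from typing import List, Optional


def classify(
    expected: Optional[date],
    returned: Optional[date],
    expected_updated: bool = False,
    returned_updated: bool = False,
) -> str:
    """Label one test page as correct or as one kind of mistake."""
    if expected is not None and returned is None:
        return "missing date"
    if expected != returned:
        return "wrong date"
    if expected_updated != returned_updated:
        return "wrong Updated behavior"
    # Also reached when no date exists and none was returned.
    return "correct"


def accuracy(labels: List[str]) -> float:
    """Share of pages labeled correct."""
    return labels.count("correct") / len(labels)
```

Note that `classify(None, None)` returns `"correct"`, matching the rule that returning no date counts as a success when the page has none.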
03.2 / PDFs

How CardCite reads a PDF.

PDFs are opened with the same engine browsers use to display them, so modern, compressed, and encrypted files can be read rather than only the simplest ones. A lighter built-in reader stands in as a backup when that engine can't open a file.

From there, two sources inside the file are read together. The first is its metadata: the title, author, dates, and producer the file records about itself. The second is the positioned text on each page, every fragment with its font size and location. This information is what lets a title be told apart from a byline, and a byline from body text.
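As a rough illustration of how font size and position can separate a title from a byline, consider the Python sketch below. The `Fragment` type, the half-point size tolerance, and the "By " prefix check are assumptions made for the example; the real extension's heuristics are not documented here and are likely more involved.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Fragment:
    text: str
    font_size: float
    y: float  # distance from the top of the page (assumed convention)


def _title_parts(cover: List[Fragment], tolerance: float) -> List[Fragment]:
    """Fragments set in (nearly) the largest font size on the cover."""
    largest = max(f.font_size for f in cover)
    return [f for f in cover if largest - f.font_size <= tolerance]


def guess_title(cover: List[Fragment], tolerance: float = 0.5) -> str:
    """Join the largest-font fragments, top to bottom, into a title."""
    if not cover:
        return ""
    parts = sorted(_title_parts(cover, tolerance), key=lambda f: f.y)
    return " ".join(f.text.strip() for f in parts)


def guess_byline(cover: List[Fragment], tolerance: float = 0.5) -> str:
    """Return the first 'By ...' line found below the title, without the 'By'."""
    if not cover:
        return ""
    title_bottom = max(f.y for f in _title_parts(cover, tolerance))
    below = sorted((f for f in cover if f.y > title_bottom), key=lambda f: f.y)
    for fragment in below:
        text = fragment.text.strip()
        if text.lower().startswith("by "):
            return text[3:].strip()
    return ""
```

A banner in small type above the title is ignored by both functions, because it is neither the largest text nor positioned below it.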

Rather than trust any single source, CardCite corroborates what it finds: a cover date is checked against the file’s own recorded date, a recorded author against the byline on the page, and a publisher name against the site it came from. The same cross-checking is applied to both PDFs and webpages.

Each value is also tidied before it reaches the citation. Titles have HTML leftovers decoded, authoring-tool prefixes and report codes stripped, all-caps headlines set in title case, and broken letter-spacing rejoined. Authors lose footnote markers and trailing memberships, and several names are formatted into a clean list. A publisher pulled from a run-together address is re-spaced using the cover’s own wording.
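A few of these cleanup steps are easy to picture in code. The Python sketch below covers only a subset (HTML entities, one example tool prefix, letter-spacing, all-caps headlines, and footnote markers on author names). The function names and the specific patterns are assumptions for illustration; report codes and trailing memberships are not handled.

```python
import html
import re
import string


def tidy_title(raw: str) -> str:
    """Clean a raw title string; a simplified subset of the steps described."""
    title = html.unescape(raw).strip()
    # Strip one example authoring-tool prefix (assumed; real lists are longer).
    title = re.sub(r"^Microsoft Word - ", "", title)
    # Rejoin letter-spaced words: "P O L I C Y" -> "POLICY".
    title = re.sub(
        r"\b(?:[A-Za-z] ){2,}[A-Za-z]\b",
        lambda m: m.group(0).replace(" ", ""),
        title,
    )
    title = re.sub(r"\s+", " ", title)
    # All-caps headline: capitalize each word (a rough stand-in for title case).
    if title.isupper():
        title = string.capwords(title)
    return title


def tidy_authors(raw: str) -> str:
    """Strip trailing footnote markers and format names as a clean list."""
    names = re.split(r",\s*(?:and\s+)?|\s+and\s+", raw)
    cleaned = [re.sub(r"[*†‡\d]+$", "", n).strip() for n in names]
    cleaned = [n for n in cleaned if n]
    if len(cleaned) <= 2:
        return " and ".join(cleaned)
    return ", ".join(cleaned[:-1]) + ", and " + cleaned[-1]
```

For instance, `tidy_title("Microsoft Word - ENERGY &amp; CLIMATE POLICY")` returns `"Energy & Climate Policy"`, and `tidy_authors("Jane Doe*, John Smith1 and Ann Lee†")` returns `"Jane Doe, John Smith, and Ann Lee"`.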

  1. Title. Prefers the file’s recorded title, or rebuilds one from the largest text on the cover.
  2. Author. Reads the byline below the title, skipping organizations and banners.
  3. Publisher. From the cover masthead, or a list of 600+ known sources.
  4. Date. A date from a cover label, the title, the address, or a repeated page header; blank when none is reliable.
04 / Customization

Set it up to match your team's format.

The output is fully customizable from the options page. Reorder fields, switch brackets for parentheses, toggle the author field on or off, adjust font size by section, and set your team name.

Changes save to your Google account and sync across devices.

How to open the options page: see 07 / Options page.
Bracket style: [Brackets] or (Parens)
Include author in citation: On or Off
Team name: Elliott/Yorke
Body font size: 12pt
Citation field order: Publisher, Date Published, Source Description, Author, Title, URL, Accessed Date, Team Name
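To show how settings like these could drive the output, here is a minimal Python sketch. The field names match the options listed above, but everything else is an assumption: the `Options` and `build_citation` names, the "; " separator, and wrapping the whole citation in one pair of brackets are illustrative choices, not CardCite's actual layout. Font sizes are left out.

```python
from dataclasses import dataclass, field
from typing import Dict, List

DEFAULT_ORDER = [
    "Publisher", "Date Published", "Source Description", "Author",
    "Title", "URL", "Accessed Date", "Team Name",
]


@dataclass
class Options:
    bracket_style: str = "brackets"  # "brackets" or "parens"
    include_author: bool = True
    team_name: str = ""
    field_order: List[str] = field(default_factory=lambda: list(DEFAULT_ORDER))


def build_citation(fields: Dict[str, str], opts: Options) -> str:
    """Assemble the non-empty fields in the configured order."""
    values = dict(fields)
    if opts.team_name:
        values["Team Name"] = opts.team_name
    parts = []
    for name in opts.field_order:
        if name == "Author" and not opts.include_author:
            continue
        value = values.get(name, "").strip()
        if value:
            parts.append(value)
    opening, closing = ("[", "]") if opts.bracket_style == "brackets" else ("(", ")")
    return opening + "; ".join(parts) + closing
```

Because the order is just a list, reordering fields in the options page amounts to rearranging `field_order`, and empty fields drop out without leaving stray separators.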
05 / Auto-fill, live preview

Every field, populated.

Publisher, title, author, date, and URL are all detected from the page. You can edit any field, and the citation at the bottom updates as you type, so you can see exactly what you'll paste before you paste it.

Formatting carries through: font sizes, italics, weights. Drop the citation into your document and it looks the way you set it up.

[Image: CardCite extension open on an article page with fields auto-populated.]
06 / Compatibility

CardCite works on regular webpages, PDFs, and archived links from the Wayback Machine and archive.today. On archived pages, it uses the original source URL in the citation rather than the archive's own address.

Webpages, PDFs, Wayback Machine, archive.today
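For Wayback Machine links, the original address is embedded in the archive URL after a timestamp, so it can be recovered with a pattern match. The Python sketch below shows the idea; the `original_url` name is hypothetical. It covers Wayback Machine links only, since short archive.today links do not contain the original address and would need it read from the archived page itself.

```python
import re

# Wayback Machine URLs look like:
#   https://web.archive.org/web/<timestamp>/<original URL>
# The timestamp may carry a two-letter modifier such as "id_" or "if_".
WAYBACK = re.compile(r"^https?://web\.archive\.org/web/\d+(?:[a-z]{2}_)?/(.+)$")


def original_url(url: str) -> str:
    """Return the archived page's original URL, or the input if not a Wayback link."""
    match = WAYBACK.match(url)
    return match.group(1) if match else url
```

For example, `original_url("https://web.archive.org/web/20240305123456/https://example.com/a")` returns `"https://example.com/a"`, while a regular URL passes through unchanged.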
07 / Options page

Tune it to your team's preferences.

To open it, find CardCite in your Chrome extensions menu and choose Options. You can also get there directly from the settings link at the top of the citation popup. Changes save back to your Google account and follow you across devices.

Step 01

Open the menu

Click the puzzle piece icon in the top right of Chrome.

Step 02

Find CardCite

Locate it in the list of installed extensions.

Step 03

Open Options

Click the three-dot menu next to it and select Options.

Step 04

Save

Adjust settings, then click Save at the bottom of the page.

Reorder fields, Toggle author, Bracket / parenthesis, Font size per section, Team name, Hide source descriptions