Publications by Type: Patent

2016
Method and Apparatus for Selecting Clusterings to Classify a Data Set
Gary King and Justin Grimmer. 12/13/2016. “Method and Apparatus for Selecting Clusterings to Classify a Data Set.” United States of America 9,519,705 B2 (Patent and Trademark Office). Abstract

In a computer assisted clustering method, a clustering space is generated from fixed basis partitions that embed the entire space of all possible clusterings. A lower dimensional clustering space is created from the space of all possible clusterings by isometrically embedding the space of all possible clusterings in a lower dimensional Euclidean space. This lower dimensional space is then sampled based on the number of documents in the corpus. Partitions are then developed based on the samples that tessellate the space. Finally, using clusterings representative of these tessellations, a two-dimensional representation for users to explore is created.

Cross-Classroom and Cross-Institution Item Validation
Gary King, Brian Lukoff, and Eric Mazur. 11/29/2016. “Cross-Classroom and Cross-Institution Item Validation.” United States of America 9,508,266 (US Patent and Trademark Office). Abstract

Anonymous pretesting items for subsequent presentation to participants in a group enable an instructor to validate responses and revise the items accordingly. ... The present invention facilitates anonymous pretesting of items in classrooms (and/or other similar settings) to which the item author has no direct access or knowledge. In some embodiments, pretesting is performed by software used by the instructor/author in his or her own classroom for other tasks. In various implementations, the software shares information with a central clearinghouse anonymously. The central clearinghouse then automatically matches students in the instructor's class with "relevant" students from other classes -- e.g., students that a statistical algorithm predicts will have approximately the same understanding, and will give approximately the same answers, as the instructor's class. ...

Systems and methods for calculating category proportions
Aykut Firat, Mitchell Brooks, Christopher Bingham, Amac Herdagdelen, and Gary King. 11/1/2016. “Systems and methods for calculating category proportions.” United States of America 9,483,544 (U.S. Patent and Trademark Office). Abstract

Systems and methods are provided for classifying text based on language using one or more computer servers and storage devices. A computer-implemented method includes receiving a training set of elements, each element in the training set being assigned to one of a plurality of categories and having one of a plurality of content profiles associated therewith; receiving a population set of elements, each element in the population set having one of the plurality of content profiles associated therewith; and calculating using at least one of a stacked regression algorithm, a bias formula algorithm, a noise elimination algorithm, and an ensemble method consisting of a plurality of algorithmic methods the results of which are averaged, based on the content profiles associated with and the categories assigned to elements in the training set and the content profiles associated with the elements of the population set, a distribution of elements of the population set over the categories.

2014
Participant Grouping for Enhanced Interactive Experience
Gary King, Brian Lukoff, and Eric Mazur. 2014. “Participant Grouping for Enhanced Interactive Experience.” United States of America US 8,914,373 B2 (U.S. Patent and Trademark Office). Abstract

Representative embodiments of a method for grouping participants in an activity include the steps of: (i) defining a grouping policy; (ii) storing, in a database, participant records that include a participant identifier, a characteristic associated with the participant, and/or an identifier for a participant’s handheld device; (iii) defining groupings based on the policy and characteristics of the participants relating to the policy and to the activity; and (iv) communicating the groupings to the handheld devices to establish the groups.

2013
Method and Apparatus for Selecting Clusterings to Classify A Predetermined Data Set
Gary King and Justin Grimmer. 2013. “Method and Apparatus for Selecting Clusterings to Classify A Predetermined Data Set.” United States of America 8,438,162 (May 7). Abstract

A method for selecting clusterings to classify a predetermined data set of numerical data comprises five steps. First, a plurality of known clustering methods are applied, one at a time, to the data set to generate clusterings for each method. Second, a metric space of clusterings is generated using a metric that measures the similarity between two clusterings. Third, the metric space is projected to a lower dimensional representation useful for visualization. Fourth, a “local cluster ensemble” method generates a clustering for each point in the lower dimensional space. Fifth, an animated visualization method uses the output of the local cluster ensemble method to display the lower dimensional space and to allow a user to move around and explore the space of clustering.
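The second step above, building a metric space of clusterings, can be sketched in a few lines. The abstract does not say which metric is used, so this sketch assumes the variation of information, a standard distance between partitions. The function names and the toy label lists are hypothetical and are not taken from the patent.

```python
from collections import Counter
from math import log


def variation_of_information(a, b):
    """Distance between two clusterings given as equal-length label lists.

    Assumption: variation of information stands in for the unspecified
    metric in the abstract. It is 0 when the two clusterings induce the
    same partition, regardless of how the clusters are labeled.
    """
    if len(a) != len(b):
        raise ValueError("clusterings must label the same items")
    n = len(a)
    count_a = Counter(a)
    count_b = Counter(b)
    joint = Counter(zip(a, b))
    vi = 0.0
    for (x, y), n_xy in joint.items():
        p_xy = n_xy / n
        # H(A|B) + H(B|A), accumulated over the non-empty joint cells
        vi -= p_xy * (log(p_xy / (count_a[x] / n)) + log(p_xy / (count_b[y] / n)))
    return max(vi, 0.0)  # guard against tiny negative rounding error


def pairwise_distances(clusterings):
    """Symmetric matrix of distances between every pair of clusterings."""
    k = len(clusterings)
    return [[variation_of_information(clusterings[i], clusterings[j])
             for j in range(k)] for i in range(k)]


# Hypothetical output of three clustering methods on the same four items.
clusterings = [
    [0, 0, 1, 1],
    [1, 1, 0, 0],  # same partition as the first, different labels
    [0, 1, 0, 1],
]
distances = pairwise_distances(clusterings)
```

A distance matrix of this kind is the usual input to a projection method such as multidimensional scaling, which is one way the third step could be carried out.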

2012
System for Estimating a Distribution of Message Content Categories in Source Data
Daniel Hopkins, Gary King, and Ying Lu. 2012. “System for Estimating a Distribution of Message Content Categories in Source Data.” United States of America 8,180,717 (May 15). Abstract

A method of computerized content analysis that gives “approximately unbiased and statistically consistent estimates” of a distribution of elements of structured, unstructured, and partially structured source data among a set of categories. In one embodiment, this is done by analyzing a distribution of a small set of individually-classified elements in a plurality of categories and then using the information determined from the analysis to extrapolate a distribution in a larger population set. This extrapolation is performed without constraining the distribution of the unlabeled elements to be equal to the distribution of labeled elements, nor constraining a content distribution of content of elements in the labeled set (e.g., a distribution of words used by elements in the labeled set) to be equal to a content distribution of elements in the unlabeled set. Not being constrained in these ways allows the estimation techniques described herein to provide distinct advantages over conventional aggregation techniques.
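As a rough illustration of extrapolating category proportions without classifying individual elements, the sketch below handles the two-category case. It assumes that the distribution of content profiles within each category is the same in the labeled and unlabeled sets, so the unlabeled profile distribution is a mixture of the two labeled ones, and it solves for the mixing weight by least squares. This is a simplified sketch under that assumption, not the patented procedure; the function names and toy data are hypothetical.

```python
from collections import Counter


def profile_distribution(seen, profiles):
    """Share of each content profile among the observed elements."""
    if not seen:
        raise ValueError("need at least one element")
    counts = Counter(seen)
    return [counts[s] / len(seen) for s in profiles]


def estimate_proportions(labeled, unlabeled, profiles, cat_a, cat_b):
    """Estimate the unlabeled share of two categories.

    labeled   : list of (profile, category) pairs
    unlabeled : list of profiles only
    Assumption: P(profile | category) is shared by both sets, so
    P(profile) = w * P(profile | cat_a) + (1 - w) * P(profile | cat_b).
    """
    p_a = profile_distribution([s for s, d in labeled if d == cat_a], profiles)
    p_b = profile_distribution([s for s, d in labeled if d == cat_b], profiles)
    p_s = profile_distribution(unlabeled, profiles)
    diff = [a - b for a, b in zip(p_a, p_b)]
    denom = sum(x * x for x in diff)
    if denom == 0:
        raise ValueError("categories have identical profile distributions")
    # Least-squares solution for the mixing weight w, clipped to [0, 1]
    w = sum((p - b) * x for p, b, x in zip(p_s, p_b, diff)) / denom
    w = min(1.0, max(0.0, w))
    return {cat_a: w, cat_b: 1.0 - w}


# Toy data: the labeled set is split 50/50, the unlabeled set is not.
profiles = ["a", "b", "c"]
labeled = ([("a", "pos")] * 6 + [("b", "pos")] * 3 + [("c", "pos")] * 1
           + [("a", "neg")] * 1 + [("b", "neg")] * 3 + [("c", "neg")] * 6)
unlabeled = ["a"] * 20 + ["b"] * 30 + ["c"] * 50
estimate = estimate_proportions(labeled, unlabeled, profiles, "pos", "neg")
```

In the toy data the labeled set is evenly split, yet the estimate for the unlabeled set is about 20% "pos", which shows that the estimate is not tied to the labeled proportions.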
